Sublime Forum

Text file to Columns (as .csv or .xls) Using Sublime

#1

I am new to it all. Please advise and be patient. I feel like I am learning how to walk!

I have a text file (see clip below) that includes lines of headers and the related info separated by a colon.
I would like to convert this text file into columns in CSV or Excel format (either works for me; I'm not sure it matters).
I have taken multiple tutorials and have attempted to download and use pandas, but I keep hitting a roadblock. So I am starting fresh and would like to ask an expert: what steps would you take to convert this file to CSV or Excel using Sublime (the latest version, which I just purchased)?

Is there a package that does this kind of thing easily that I just have not found?

My text file looks like this (and has thousands of entries). I want to get each header (Author, Title, Publication Year, etc.) into its own column in CSV or Excel format.

  1. Author(s): Institute of Contemporary British History.
    Title Abbreviation: 20 Century Br Hist
    Title(s): 20 century British history.
    Publication Start Year: 1990
    Publication End Year:
    Frequency: 4 no. a year, 1999-
    Country of Publication: England
    Publisher: Eynsham, Oxford : Oxford University Press, [1990-
    Description: v.
    Language: English
    ISSN: 0955-2359(Print); 1477-4674(Electronic); 0955-2359(Linking)
    LCCN: 92-640104, sn 90-31375
    Electronic Links: , https://academic.oup.com/tcbhOxford University Press
    Selectively Indexed in: MEDLINEv1n2, 1990-PubMedv1n2, 1990-
    Current Indexing Status: Currently indexed for MEDLINE.
    Current Subset: History of Medicine (non-Index Medicus)
    MeSH: History*; United Kingdom
    Publication Type(s): Periodical
    Notes: Issues for 2004- have title: 20th century British history. Title from cover. Also
    issued online. Founded by the Institute of Contemporary British History.
    Other ID: (DNLM)SR0067814(s)(OCoLC)22481516
    NLM ID: 9015384[Serial]

  2. Author(s): International Anesthesia Research Society,
    Title Abbreviation: A A Pract
    Title(s): A&A practice.
    Publication Start Year: 2018
    Publication End Year:
    Frequency: Biweekly
    Country of Publication: United States
    Publisher: [Philadelphia, PA] : Wolters Kluwer Health, Inc., [2018]-
    Description: 1 online resource
    Language: English
    ISSN: 2575-3126(Electronic); 2575-3126(Linking)
    LCCN: 2017203165
    Electronic Links: , https://ovidsp.ovid.com/ovidweb.cgi?T=JS&MODE=ovid&PAGE=toc&D=ovft&AN=02054229-000000000-00000Ovid - NIH (Journals@OVID)
    Fully Indexed In: MEDLINE: v10n1, Jan. 1 2018-
    In: PubMed: v10n1, Jan. 1 2018-;
    Current Indexing Status: Currently indexed for MEDLINE.
    Current Subset: Index Medicus
    MeSH: Anesthesiology*
    Publication Type(s): Periodical
    Notes: Complemented by (work): Anesthesia & analgesia 0003-2999 (DLC) 2004265703
    (OCoLC)1481131 (DNLM)1310650.
    Other ID: (OCoLC)1004852152
    NLM ID: 101714112[Serial]

Thank you in advance for your expertise and kindness.


#2

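As a hint to get you started, and assuming every entry has the same headings in the same order: with regex mode enabled in Find and Replace, first replace ^[^:\n]+: ? with nothing to strip the headings, then replace \n with , or \t, depending on your preferred CSV separator and on whether the fields themselves contain those characters. You will probably need a bit of extra work so that the 1., 2. etc. act as record separators rather than being flattened with everything else; I'm not sure exactly what that would look like.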
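
If the find-and-replace route gets fiddly (for example because the Notes field wraps onto a second line), a small script may be easier. Below is a minimal Python sketch of that different approach, not a Sublime feature: it treats each numbered line as the start of a record, splits each line at its first colon, and appends lines without a leading "Header:" to the previous field. The names parse_records, to_csv and SAMPLE are made up for illustration, and the sample is a shortened copy of the entries above. It assumes wrapped continuation lines do not themselves look like "Word: value"; a line such as "In: PubMed: ..." would become its own column, and a heading repeated within one entry would overwrite the earlier value.

```python
import csv
import io
import re

# Start of a record, e.g. "  1. Author(s): ..."
RECORD_START = re.compile(r"^\s*\d+\.\s+")
# "Header: value", split at the first colon; the header must start with a letter.
FIELD = re.compile(r"^([A-Za-z][^:]*):\s?(.*)$")

# Shortened copy of the two entries from the question (illustrative only).
SAMPLE = """\
  1. Author(s): Institute of Contemporary British History.
    Title Abbreviation: 20 Century Br Hist
    Publication End Year:
    Notes: Issues for 2004- have title: 20th century British history. Title from cover. Also
    issued online. Founded by the Institute of Contemporary British History.

  2. Author(s): International Anesthesia Research Society,
    Title Abbreviation: A A Pract
    Publication End Year:
    Notes: Complemented by (work): Anesthesia & analgesia 0003-2999 (DLC) 2004265703
    (OCoLC)1481131 (DNLM)1310650.
"""


def parse_records(text):
    """Return a list of dicts, one per numbered entry."""
    records = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if RECORD_START.match(raw):
            # A new numbered entry begins: start a fresh record.
            current = {}
            records.append(current)
            last_key = None
            line = RECORD_START.sub("", raw, count=1).strip()
        if current is None:
            continue  # ignore anything before the first numbered entry
        match = FIELD.match(line)
        if match:
            last_key = match.group(1).strip()
            current[last_key] = match.group(2).strip()
        elif last_key is not None:
            # No "Header:" prefix, so treat it as a wrapped continuation line.
            current[last_key] += " " + line
    return records


def to_csv(records):
    """Return CSV text with one column per header, in first-seen order."""
    headers = []
    for record in records:
        for key in record:
            if key not in headers:
                headers.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


print(to_csv(parse_records(SAMPLE)))
```

For a real file you would read its contents into a string instead of using SAMPLE, and write the result of to_csv to a file opened with newline="" so the csv module controls the line endings. The csv module also quotes fields that contain commas, which the plain \n to , replacement does not do.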
