Clean export of the references in your Bibliography
Like many scientific writers, I use Mendeley to collect all my references in a single (and often messy) .bib file. Cleaning that file and tailoring its entries to each document improves the readability and efficiency of a LaTeX project. I found a simple solution that may help many of you; it takes only a few steps.
First, the following Python program, clean_bib.py, removes all the undesired fields from the BibTeX entries. It relies on the bibtexparser package. The full documentation of the program can be found at https://github.com/zohannn/clean_bib.git.
import datetime
import io
import sys

import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.bwriter import BibTexWriter
# The star import provides type(), page_double_hyphen() and convert_to_unicode().
# Note that this type() is bibtexparser's function and shadows the builtin.
from bibtexparser.customization import *

input_b = "library.bib"
output_b = "lib_unused_fields.bib"

now = datetime.datetime.now()
print("{0} Cleaning duff bib records from {1} into {2}".format(now, input_b, output_b))


# Let's define a function to customize our entries.
# It takes a record and returns this record.
def customizations(record):
    """Use some functions delivered by the library

    :param record: a record
    :returns: -- customized record
    """
    record = type(record)
    record = page_double_hyphen(record)
    record = convert_to_unicode(record)

    # Delete the following keys.
    unwanted = ["archivePrefix", "arxivId", "doi", "url", "abstract", "file",
                "gobbledegook", "isbn", "link", "keyword", "keywords", "number",
                "mendeley-tags", "annote", "pmid", "chapter", "institution",
                "issn", "month"]
    # Alternative list that keeps archivePrefix, arxivId and doi:
    # unwanted = ["url", "abstract", "file", "gobbledegook", "isbn", "link",
    #             "keyword", "keywords", "number", "mendeley-tags", "annote",
    #             "pmid", "chapter", "institution", "issn", "month"]
    for val in unwanted:
        record.pop(val, None)
    return record


bib_database = None
with open(input_b) as bibtex_file:
    parser = BibTexParser()
    parser.customization = customizations
    parser.ignore_nonstandard_types = False
    bib_database = bibtexparser.load(bibtex_file, parser=parser)

if bib_database:
    now = datetime.datetime.now()
    success = "{0} Loaded {1} found {2} entries".format(now, input_b, len(bib_database.entries))
    print(success)
else:
    now = datetime.datetime.now()
    errs = "{0} Failed to read {1}".format(now, input_b)
    print(errs)
    sys.exit(errs)

bibtex_str = None
if bib_database:
    writer = BibTexWriter()
    writer.order_entries_by = ('author', 'year', 'type')
    bibtex_str = bibtexparser.dumps(bib_database, writer)
    # io.open with an explicit encoding works on both Python 2 and Python 3.
    with io.open(output_b, "w", encoding="utf-8") as text_file:
        text_file.write(bibtex_str)

if bibtex_str:
    now = datetime.datetime.now()
    success = "{0} Wrote to {1} with len {2}".format(now, output_b, len(bibtex_str))
    print(success)
else:
    now = datetime.datetime.now()
    errs = "{0} Failed to write {1}".format(now, output_b)
    print(errs)
    sys.exit(errs)
The script assumes that your .bib file is named library.bib and is located in the same folder as the script. You can customize the unwanted list of fields according to what your LaTeX project requires. I personally run clean_bib.py from the following Bash script, which also lets me apply the German style of bibclean to the entries.
#!/bin/bash
python clean_bib.py
bibclean -German-style lib_unused_fields.bib > library_clean.bib
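If different projects need different fields, the field-stripping step can be isolated in a small function. The following is only a minimal sketch: strip_fields, DEFAULT_UNWANTED and the sample entry are hypothetical names that are not part of clean_bib.py, and the entry is modelled as a plain dictionary, which is how bibtexparser hands records to the customization function.

```python
# Hypothetical helper, not part of clean_bib.py: remove unwanted fields
# from a BibTeX entry represented as a plain dict.
DEFAULT_UNWANTED = ("abstract", "file", "keywords", "mendeley-tags", "url")


def strip_fields(entry, unwanted=DEFAULT_UNWANTED, keep=()):
    """Return a copy of `entry` without the unwanted fields.

    Fields listed in `keep` survive even if they also appear in `unwanted`.
    """
    drop = set(unwanted) - set(keep)
    return {key: value for key, value in entry.items() if key not in drop}


# Made-up sample entry, only for illustration.
entry = {
    "ENTRYTYPE": "article",
    "ID": "doe2020",
    "author": "Doe, Jane",
    "title": "An Example",
    "year": "2020",
    "url": "https://example.org",
    "abstract": "Long text that bloats the .bib file.",
}

# This project wants to keep the URL but drop everything else in the list.
cleaned = strip_fields(entry, keep=("url",))
print(sorted(cleaned))
```

Returning a copy instead of popping keys in place leaves the original entry untouched, which makes it easy to produce several differently trimmed libraries from the same source.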
Finally, your LaTeX project most likely does not cite every reference in your library library_clean.bib. The non-cited entries should therefore be removed to create a custom bibliography. Assuming that your main LaTeX file is named main.tex and that your library is located under the bib/ folder, the following script exports only the cited references of your LaTeX project and places them in the file bib/exported_refs.bib. It reads main.aux, so compile main.tex at least once beforehand.
#!/bin/bash
file="bib/exported_refs.bib"
if [ -f "$file" ]; then
    rm "$file"
fi
bibexport -o "$file" main.aux
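To check which keys the export should contain, you can list the citations recorded in the .aux file. The sketch below is not how bibexport works internally; it is a quick sanity check that assumes the classic BibTeX workflow, in which every \cite writes a \citation{...} line to main.aux. The function name cited_keys and the sample content are made up for illustration.

```python
import re

# Matches the \citation{...} lines written to the .aux file
# (assumption: classic BibTeX workflow).
CITATION_RE = re.compile(r"\\citation\{([^}]*)\}")


def cited_keys(aux_text):
    """Return the set of citation keys found in the text of an .aux file."""
    keys = set()
    for group in CITATION_RE.findall(aux_text):
        # One \citation line may hold several comma-separated keys.
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


# Made-up .aux content, only for illustration.
sample_aux = "\n".join([
    r"\relax",
    r"\citation{doe2020,smith2019}",
    r"\bibdata{bib/library_clean}",
    r"\citation{doe2020}",
])

print(sorted(cited_keys(sample_aux)))
```

For a real project, read the file first, for example with open("main.aux") as f: keys = cited_keys(f.read()), and compare the result with the entries in bib/exported_refs.bib.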
I hope this post is useful for those of you who, like me, have struggled with formatting the bibliography of different LaTeX projects. Please feel free to comment and share your own solution; I would be happy to learn new and more efficient techniques.