Properties

$latex_special_chars_dictionary

$latex_special_chars_dictionary : array

An associative array mapping regular expressions that match LaTeX code to the UTF-8 representation of the symbol they represent.

Because PHP does not allow a static property to be initialized with the result of a static method call, this array is initialized to null. Never use this variable directly; use the getter method get_latex_special_chars_dictionary() instead.

Type

array — An associative array mapping regular expressions that match LaTeX code to the UTF-8 representation of the symbol they represent
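The null-then-getter pattern described for this property can be sketched as follows. The class name Example_Latex_Tools, the helper build_pattern(), and the two dictionary entries are hypothetical and only illustrate lazy initialization; they are not taken from the actual class.

```php
<?php
// Hypothetical class illustrating lazy initialization of a static property.
class Example_Latex_Tools
{
    // A static property cannot be initialized with a method call, so start with null.
    private static $dictionary = null;

    // Hypothetical helper: regex fragment matching a backslash followed by $char.
    private static function build_pattern(string $char): string
    {
        return '\\\\' . preg_quote($char, '#');
    }

    // Builds the dictionary on first access and returns the cached array afterwards.
    public static function get_dictionary(): array
    {
        if (self::$dictionary === null) {
            self::$dictionary = [
                self::build_pattern('ss') => 'ß',
                self::build_pattern('ae') => 'æ',
            ];
        }
        return self::$dictionary;
    }
}
```

Callers always go through the getter, so the property is never observed in its null state.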

$latex_clean_up_dictionary

$latex_clean_up_dictionary : 

LaTeX cleanup dictionary

After the above replacements have been performed, the remaining special characters are cleaned up according to this list.

Type

$latex_special_chars_reverse_dictionary

$latex_special_chars_reverse_dictionary : 

Associative array of utf8 representations and their respective LaTeX representations.

In some cases we need to generate LaTeX code from utf-8 encoded strings containing special characters. This array helps in doing this.

Type

— Dictionary of special characters and their LaTeX representations.

$utf8_to_ascii_dictionary

$utf8_to_ascii_dictionary : 

Associative array of utf8 representations and their closest ascii characters.

In some cases we need to generate ascii strings from utf-8 encoded strings containing special characters. This array helps in doing this.

Type

— Dictionary of special characters and their closest ascii versions.

$additional_default_macros

$additional_default_macros : array

Additional macros.

Array of default macros that we always want to expand, irrespective of whether they appeared in a LaTeX file.

Type

array — Array of additional macros that we often want to expand even if they were not explicitly defined by the user.

Methods

get_latex_special_chars_dictionary()

get_latex_special_chars_dictionary() 

Get the $latex_special_chars_dictionary.

We are dealing with a lot of LaTeX code in which special characters are LaTeX encoded. On our website we want to display them in a pretty way. The following is a (more or less complete) mapping of LaTeX encodings into utf-8 characters whenever such a character is available. This list must only contain LaTeX macros that contain a \, because for performance reasons replacing is stopped once all \ have been eliminated. Also, a negative lookaround assertion must be added when using this dictionary to prevent replacements in cases such as \v{a}.
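A minimal sketch of how such a dictionary might be applied, including the early exit once no backslash is left. The two entries and the function replace_special_chars() are hypothetical; the real dictionary is much larger, and the lookaround assertion mentioned above is omitted here for brevity.

```php
<?php
// Hypothetical two-entry excerpt: regex fragment => UTF-8 character.
$dictionary = [
    '\\\\"\\{?a\\}?' => 'ä',  // matches \"a and \"{a}
    "\\\\'\\{?e\\}?" => 'é',  // matches \'e and \'{e}
];

// Applies the replacements one by one and stops as soon as no backslash remains.
function replace_special_chars(string $text, array $dictionary): string
{
    foreach ($dictionary as $pattern => $replacement) {
        if (strpos($text, '\\') === false) {
            break; // nothing left that could be a LaTeX macro
        }
        $text = preg_replace('#' . $pattern . '#u', $replacement, $text);
    }
    return $text;
}

$pretty = replace_special_chars("J\\\"ager caf\\'e", $dictionary);
```

The early exit only works because every entry contains a backslash, which is why the dictionary must not contain macros without one.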

get_latex_clean_up_dictionary()

get_latex_clean_up_dictionary() 

Get the $latex_clean_up_dictionary.

get_latex_special_chars_reverse_dictionary()

get_latex_special_chars_reverse_dictionary() 

Get the $latex_special_chars_reverse_dictionary.

get_utf8_to_ascii_dictionary()

get_utf8_to_ascii_dictionary() 

Get the $utf8_to_ascii_dictionary.

expand_cite_to_html()

expand_cite_to_html(string  $text, string  $bbl) 

Expand \cite{} commands to html code.

The resulting HTML code makes every \cite a clickable hyperlink to the corresponding entry in the bibliography.

Parameters

string $text

The text in which \cite commands are to be expanded

string $bbl

The bbl code of the bibliography that contains the corresponding bibliography entries.

normalize_whitespace_and_linebreak_characters()

normalize_whitespace_and_linebreak_characters(string  $text, boolean  $single_line = true, boolean  $remove_extra_newlines = false) 

Normalize whitespace and line break characters.

Parameters

string $text

LaTeX text to normalize.

boolean $single_line

Whether to output a single-line text.

boolean $remove_extra_newlines

If true, single newlines are replaced by a space and runs of more than two successive newlines are replaced by exactly two newlines.
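The effect of $remove_extra_newlines can be illustrated with two regular expressions. This is only a sketch of the rule as stated above; collapse_newlines() is a hypothetical name and the actual method may handle further whitespace characters.

```php
<?php
// Hypothetical helper illustrating the newline rule only.
function collapse_newlines(string $text): string
{
    // Three or more successive newlines become exactly two.
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    // A newline that has no newline directly before or after it becomes a space.
    $text = preg_replace('/(?<!\n)\n(?!\n)/', ' ', $text);
    return $text;
}
```

Paragraph breaks (two newlines) survive, while hard-wrapped lines inside a paragraph are joined.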

remove_font_changing_commands()

remove_font_changing_commands(string  $text) 

Remove font changing commands.

Parameters

string $text

LaTeX text to remove font commands from.

extract_bibliographies()

extract_bibliographies(string  $latex) 

Extract all bibliographies from latex code.

Parameters

string $latex

Latex code to search for bibliographies.

extract_abstracts()

extract_abstracts(string  $latex) 

Extract all abstracts from latex code.

Parameters

string $latex

Latex code to search for abstracts.

un_escape_url()

un_escape_url(string  $latex_url) : string

Un-escape a LaTeX style escaped url.

Parameters

string $latex_url

LaTeX style escaped url.

Returns

string —

Url without LaTeX escape sequences.
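As a rough illustration, un-escaping can be thought of as dropping the backslash in front of characters that LaTeX requires to be escaped. The set of characters handled below and the name un_escape_url_sketch() are assumptions, not the actual implementation.

```php
<?php
// Hypothetical sketch: remove the backslash before _ % # & ~ $ { }.
function un_escape_url_sketch(string $latex_url): string
{
    return preg_replace('/\\\\([_%#&~${}])/', '$1', $latex_url);
}
```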

latex_to_utf8_outside_math_mode()

latex_to_utf8_outside_math_mode(string  $latex_text, boolean  $clean = true) : string

Convert LaTeX code to utf8, leaving alone everything within math mode.

Does not attempt to deal with commands that cannot be reasonably represented in utf8, such as \small, \newblock, \emph, ...

Leaves mathematical formulas enclosed in $...$ (and other math mode delimiters such as $$...$$,...) intact so that they can be displayed nicely using MathJax. This function turns the various math modes into inline mode, as in $a+b$.

Parameters

string $latex_text

Latex code whose non-math part is to be converted to utf8

boolean $clean

Whether to perform some cleanup at the end.

Returns

string —

A utf8 approximation to $latex_text

preg_split_at_latex_math_mode_delimters()

preg_split_at_latex_math_mode_delimters(string  $text, integer  $limit = -1, integer  $flags) 

Preg split LaTeX code at math mode delimiters.

If $text is well formed, the odd-numbered parts of the resulting array, counting from one (array indices 0, 2, 4, ...), are outside math mode and the even-numbered parts are inside math mode. If $flags is not specified or set to 0, the delimiters are stripped.

Parameters

string $text

Text to be split at LaTeX math mode delimiters such as $...$, \[...\], \(...\).

integer $limit

If specified, then only substrings up to limit are returned with the rest of the string being placed in the last substring. A limit of -1 or 0 means "no limit" and, as is standard across PHP, you can use NULL to skip to the flags parameter.

integer $flags

Flags can be any combination of valid preg_split flags. Defaults to 0.
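A simplified sketch of the splitting idea, using plain preg_split() with a hypothetical pattern. It recognises only $, $$, \[ \] and \( \), strips the delimiters, and does not treat an escaped \$ specially, so it is an illustration rather than a replacement for the method.

```php
<?php
// Hypothetical helper: split text at a small set of math mode delimiters.
function split_at_math_delimiters(string $text): array
{
    // $$ is listed before $ so that display math is not read as two empty inline formulas.
    return preg_split('/\$\$|\$|\\\\\[|\\\\\]|\\\\\(|\\\\\)/', $text);
}

$parts = split_at_math_delimiters('Let $x$ and \\[y\\] hold');
```

For well-formed input the pieces at indices 0, 2, 4 are outside math mode and those at indices 1, 3 are inside.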

strpos_outside_math_mode()

strpos_outside_math_mode(string  $latex_text, string  $string) 

Strpos in LaTeX code, but only taking into account the part of the code that is not in math mode. This function uses the multibyte-safe mb_strpos() function.

Parameters

string $latex_text

Latex text in whose non-math parts the string is to be found.

string $string

String to be found.

preg_match_outside_math_mode()

preg_match_outside_math_mode(string  $pattern, string  $subject, array  $matches = array(), integer  $flags, integer  $offset) : integer

Preg match in latex code, but only taking into account the part of code that is not in math mode.

Parameters

string $pattern

Regular expression to match against.

string $subject

Latex text in whose non-math parts the expression is to be found.

array $matches

If matches is provided, then it is filled with the results of search. $matches[0] will contain an array of texts that matched the full pattern, $matches[1] will have an array of the texts that matched the first captured parenthesized subpattern, and so on.

integer $flags

Flags as in preg_match().

integer $offset

Place from which to start the search (in bytes) within each segment of subject that is outside math mode.

Returns

integer —

The total number of segments in which a match was found, or false if an error occurred during any of the matches.
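The return value can be illustrated with a reduced sketch that only knows $...$ as a math delimiter and ignores $matches, $flags and $offset. The function name is hypothetical.

```php
<?php
// Hypothetical sketch: count the non-math segments in which $pattern matches.
function count_matching_segments_outside_math(string $pattern, string $subject)
{
    $parts = preg_split('/\$/', $subject);
    $count = 0;
    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            continue; // pieces at odd indices are inside math mode
        }
        $result = preg_match($pattern, $part);
        if ($result === false) {
            return false; // propagate errors from preg_match()
        }
        $count += $result; // preg_match() returns 1 or 0 per segment
    }
    return $count;
}
```

Because preg_match() stops at the first hit, a segment with several matches still contributes only 1 to the total.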

parse_bbl()

parse_bbl(string  $bbl) 

Parse bbl code of potentially multiple bibliographies.

Parses bbl code produced by either bibtex or biblatex as well as bbl code written by hand by authors. $bbl may be a concatenation of multiple bibliographies.

Parameters

string $bbl

Bibliography or concatenation of bibliographies in .bbl format as it is produced by BibTeX, Biber, and BibLaTeX.

parse_single_bbl()

parse_single_bbl(string  $bbl) 

Parse bbl code of an individual bibliography.

Parses bbl code produced by either bibtex or biblatex as well as bbl code written by hand by authors.

Parameters

string $bbl

Bibliography in .bbl format as it is produced by BibTeX, Biber, and BibLaTeX.

utf8_to_latex()

utf8_to_latex(string  $text) 

Convert utf8 text to LaTeX code.

Parameters

string $text

Text with special characters that are to be converted to LaTeX encoding.

utf8_to_bibtex()

utf8_to_bibtex(string  $text) 

Convert utf8 text to LaTeX code suitable for BibTeX.

Parameters

string $text

Text with special characters that are to be converted to a LaTeX type encoding suitable for the use in BibTeX .bib files.

extract_latex_macros()

extract_latex_macros(string  $latex_source) 

Extracts LaTeX command definitions from latex code.

Supported are:

newcommand, providecommand, renewcommand, def, DeclarePairedDelimiter

For a command such as

\newcommand{\myvec}[1]{\vec{#1}}

the resulting definition is of the following form:

Full match 0-22 \newcommand{\myvec}[1]
Group 1. 1-11 newcommand
Group 2. 12-18 \myvec
Group 3. 19-22 [1]
Group 4. 22-22 (empty here; filled in case a default is set making an argument optional)
Group 5. 23-31 \vec{#1}

Parameters

string $latex_source

LaTeX source code from which the macro definitions are to be extracted.
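A reduced sketch of a pattern that yields the group layout listed above. It is a hypothetical regular expression covering only newcommand, renewcommand and providecommand, not the one used by the method; the body is matched with a recursive subpattern so that nested braces stay balanced.

```php
<?php
// Hypothetical sketch: extract \newcommand-style definitions with the same group numbering.
function extract_newcommands(string $latex_source): array
{
    $pattern = '/\\\\(newcommand|renewcommand|providecommand)\s*' // group 1: defining command
        . '\{(\\\\[a-zA-Z]+)\}'                                   // group 2: macro name
        . '(\[[0-9]\])?'                                          // group 3: number of arguments
        . '(\[[^\]]*\])?'                                         // group 4: default for optional argument
        . '\s*\{((?:[^{}]++|\{(?5)\})*)\}/';                      // group 5: body with balanced braces
    preg_match_all($pattern, $latex_source, $matches, PREG_SET_ORDER);
    return $matches;
}

$definitions = extract_newcommands('\\newcommand{\\myvec}[1]{\\vec{#1}}');
```

With PREG_SET_ORDER each element of the result holds the full match at index 0 and the five groups at indices 1 to 5; a group that did not participate, such as group 4 here, is an empty string.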

get_special_macros_to_ignore_in_bbl()

get_special_macros_to_ignore_in_bbl() 

Get special macros we sometimes want to ignore in expansion.

Some macros are better kept unexpanded even if the authors have manually re-defined them, because we need them to more efficiently parse the LaTeX code (i.e., to identify DOIs or URLs).

remove_special_macros_to_ignore_in_bbl()

remove_special_macros_to_ignore_in_bbl(array  $latex_macro_definitions) 

Remove special macros we sometimes want to ignore in expansion.

Parameters

array $latex_macro_definitions

Array of latex macro definitions.

expand_latex_macros()

expand_latex_macros(array  $macro_definitions, string  $text) 

Expand LaTeX macros.

Attempts to expand all macros in $text for which a definition is given in $macro_definitions.

Expects definitions like those produced by self::extract_latex_macros().

Parameters

array $macro_definitions

Array of macro definitions.

string $text

Text containing LaTeX macros that are to be expanded.
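The expansion step can be sketched for the simplest case: macros whose arguments contain no nested braces and no optional arguments. The function expand_macros_sketch() is hypothetical; it assumes each definition is an array with the macro name at index 2, the argument count such as [1] at index 3, and the body at index 5, as in the group layout documented for extract_latex_macros().

```php
<?php
// Hypothetical sketch: expand macros with brace-free arguments only.
function expand_macros_sketch(array $macro_definitions, string $text): string
{
    foreach ($macro_definitions as $definition) {
        $name = $definition[2];                       // e.g. \myvec
        $num_args = (int) trim($definition[3], '[]'); // '[1]' gives 1, '' gives 0
        $body = $definition[5];                       // e.g. \vec{#1}

        // Macro name not followed by a letter, then one {...} group per argument.
        $pattern = '/' . preg_quote($name, '/') . '(?![a-zA-Z])'
            . str_repeat('\{([^{}]*)\}', $num_args) . '/';

        $text = preg_replace_callback(
            $pattern,
            function (array $m) use ($body, $num_args): string {
                $expanded = $body;
                for ($i = 1; $i <= $num_args; $i++) {
                    $expanded = str_replace('#' . $i, $m[$i], $expanded);
                }
                return $expanded;
            },
            $text
        );
    }
    return $text;
}

$definitions = [['', 'newcommand', '\\myvec', '[1]', '', '\\vec{#1}']];
$expanded = expand_macros_sketch($definitions, '$\\myvec{a} + \\myvec{b}$');
```

The lookahead after the macro name keeps a definition of \myvec from touching a longer macro name that merely starts with the same letters.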

get_month_string()

get_month_string(string|int  $month) 

Get the BibTeX string representation of a numeric month.

Parameters

string|int $month

Number of the month in the range 1-12.
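A sketch under the assumption that the string representation meant here is BibTeX's standard three-letter month macro (jan, feb, ..., dec). The function name and the exception for out-of-range input are hypothetical choices.

```php
<?php
// Hypothetical sketch: map a month number (1-12) to BibTeX's three-letter month macro.
function month_to_bibtex_string($month): string
{
    $months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
               'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];
    $index = (int) $month; // accepts both 3 and '3'
    if ($index < 1 || $index > 12) {
        throw new InvalidArgumentException('Month must be in the range 1-12.');
    }
    return $months[$index - 1];
}
```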

utf8_to_closest_latin_letter_string()

utf8_to_closest_latin_letter_string(string  $text) 

Convert utf8 strings to the closest Latin letter string.

Parameters

string $text

Text with utf8 special characters.
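A dictionary-based conversion of this kind can be sketched with strtr(). The four mappings and the function name to_closest_ascii() are hypothetical; the actual dictionary is $utf8_to_ascii_dictionary and may map these characters differently.

```php
<?php
// Hypothetical excerpt of a UTF-8 to ASCII dictionary.
$utf8_to_ascii = ['ö' => 'o', 'ő' => 'o', 'ß' => 'ss', 'é' => 'e'];

// Replaces every key found in $text by its value; other characters are kept.
function to_closest_ascii(string $text, array $dictionary): string
{
    return strtr($text, $dictionary);
}

$ascii = to_closest_ascii('Gödel, Erdős, Straße', $utf8_to_ascii);
```

strtr() with an array replaces all keys in a single pass, so a replacement value is never itself replaced again.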

title_to_key_suffix()

title_to_key_suffix(string  $text) 

Compute a string suitable for taking the role of the BibTeX key from a text such as a title.

Parameters

string $text

Title of an article from which a BibTeX entry key is to be generated.

match_single_non_character_makro_regexp_fragment()

match_single_non_character_makro_regexp_fragment(string  $char) 

Utility function to construct the $latex_special_chars_dictionary.

Parameters

string $char

The non-character symbol characteristic of the respective LaTeX macro.

match_single_character_makro_regexp_fragment()

match_single_character_makro_regexp_fragment(string  $char) 

Utility function to construct the $latex_special_chars_dictionary.

Parameters

string $char

The character characteristic of the respective LaTeX macro.