Package commons :: Module strs

Module strs

String formatting, encoding, etc.

Classes

str_test

Functions

format(*args)
Formats the args the way the print built-in would.

safe_ascii(s)
Casts a Unicode string to a regular ASCII string.

cp1252_to_unicode(x)
Converts characters 0x80 through 0x9f to their proper Unicode equivalents.

unwrap(s)
Joins a bunch of lines.

indent(s, ind=' ')
Prefixes each line in s with ind.

unindent(text, amt=None)
If amt is 0, removes all leading whitespace from each line in text.

remove_empty_lines(s)
Removes all empty lines (or lines of just whitespace).

underline(s, sep)
Appends to s a newline followed by sep repeated len(s) times.

dos2unix(s)
Removes carriage returns.

quotejs(s)
Escapes a string as a JavaScript Unicode string literal.

unicode2html(s)
Extends cgi.escape() with escapes for all Unicode characters.

html2unicode(text)
Acts like a cgi.unescape() (which the standard library does not provide).

nat_lang_join(xs, last_glue, two_glue=None, glue=', ')
Natural-language join.

or_join(xs)

and_join(xs)
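
Three of the helpers summarized above (underline, dos2unix, remove_empty_lines) are described precisely enough to sketch. The following is a minimal Python 3 illustration of the documented behavior, not the module's actual source; the `_sketch` names are hypothetical, and splitting on "\n" is an assumption.

```python
def underline_sketch(s, sep):
    # A newline, then sep repeated once per character of s.
    return s + "\n" + sep * len(s)


def dos2unix_sketch(s):
    # Drop every carriage return, turning "\r\n" into "\n".
    return s.replace("\r", "")


def remove_empty_lines_sketch(s):
    # Keep only lines that contain something other than whitespace.
    return "\n".join(line for line in s.split("\n") if line.strip())


print(underline_sketch("Title", "="))
# Title
# =====
```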

Variables

cp1252_to_unicode_translations = [(u'\x80', u'\u20ac'), (u'\x82', u'\u201a'), ...

unicode_special = re.compile(r'[\x80-\uffff]')

__package__ = 'commons'

Function Details

safe_ascii(s)

Casts a Unicode string to a regular ASCII string. This may be lossy.
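
To illustrate what "lossy" means here, the sketch below shows one plausible way to force a string into ASCII in Python 3. The name `safe_ascii_sketch` and the choice of the "replace" error handler are assumptions; the module may instead drop or transliterate characters.

```python
def safe_ascii_sketch(s):
    # "replace" substitutes '?' for every character ASCII cannot represent,
    # so the original character cannot be recovered afterwards.
    return s.encode("ascii", "replace").decode("ascii")


print(safe_ascii_sketch("caf\u00e9"))  # caf?
```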

cp1252_to_unicode(x)

Converts characters 0x80 through 0x9f to their proper Unicode equivalents. See http://www.intertwingly.net/stories/2004/04/14/i18n.html for the nice translation table on which this is based.
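
The conversion can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not the module's code: the table is derived from Python's own cp1252 codec rather than from the module's cp1252_to_unicode_translations list, and the names `build_cp1252_table` and `cp1252_to_unicode_sketch` are hypothetical.

```python
def build_cp1252_table():
    # Map each code point in 0x80..0x9f to the character Windows-1252
    # assigns to that byte value.
    table = {}
    for byte in range(0x80, 0xA0):
        try:
            table[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values (0x81, for example) are undefined in
            # Windows-1252; leave those characters untouched.
            continue
    return table


CP1252_TABLE = build_cp1252_table()


def cp1252_to_unicode_sketch(x):
    # str.translate accepts a dict keyed by code point.
    return x.translate(CP1252_TABLE)
```

For example, `cp1252_to_unicode_sketch("\x93hi\x94")` returns the same text wrapped in the curly quotes U+201C and U+201D.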

unwrap(s)

Joins a bunch of lines. s is either a single string (which will be split on newlines into a list of strings) or a list of strings (representing lines).
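
A minimal Python 3 sketch of the two accepted input forms is shown below. The documentation does not say which separator is used to join the lines; a single space is assumed here and exposed as the hypothetical `sep` parameter, and `unwrap_sketch` is likewise a made-up name.

```python
def unwrap_sketch(s, sep=" "):
    # A single string is first split on newlines; a list is used as given.
    lines = s.split("\n") if isinstance(s, str) else list(s)
    # Assumption: lines are joined with one space.
    return sep.join(lines)


print(unwrap_sketch("one\ntwo\nthree"))  # one two three
```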

indent(s, ind=' ')

Prefixes each line in s with ind. s can be either a string (which will be broken up into a list of lines) or a list of strings (treated as lines). Returns a single (indented) string.
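
The behavior described above can be sketched in Python 3 as follows. The name `indent_sketch` is hypothetical, and splitting on "\n" (rather than on all line-ending styles) is an assumption.

```python
def indent_sketch(s, ind=" "):
    # A string is broken into lines; a list is treated as lines already.
    lines = s.split("\n") if isinstance(s, str) else list(s)
    # Always return one string, with ind in front of every line.
    return "\n".join(ind + line for line in lines)


print(indent_sketch("a\nb", "> "))
# > a
# > b
```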

unindent(text, amt=None)

If amt is 0, removes all leading whitespace from each line in text. If amt is None, finds the smallest amount of leading whitespace on any non-empty line and removes that many chars from each line. If amt is positive, removes amt chars from each line.
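
The three modes of amt can be sketched in Python 3 as below. This is an illustration of the documented rules, not the module's source; `unindent_sketch` is a hypothetical name, and treating whitespace-only lines as empty when measuring the minimum is an assumption.

```python
def unindent_sketch(text, amt=None):
    lines = text.split("\n")
    if amt == 0:
        # Mode 1: strip all leading whitespace from every line.
        return "\n".join(line.lstrip() for line in lines)
    if amt is None:
        # Mode 2: measure the smallest indent over non-empty lines.
        widths = [len(line) - len(line.lstrip())
                  for line in lines if line.strip()]
        amt = min(widths) if widths else 0
    # Modes 2 and 3: drop the first amt characters of each line.
    return "\n".join(line[amt:] for line in lines)


print(unindent_sketch("  a\n    b"))
# a
#   b
```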

html2unicode(text)

Acts like a cgi.unescape() (which the standard library does not provide). Removes HTML or XML character references and entities from a text string, replacing each with the character it denotes.

See http://effbot.org/zone/re-sub.htm#unescape-html
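
A regex-substitution approach in the spirit of the recipe linked above can be sketched in Python 3 as follows. The name `html2unicode_sketch` is hypothetical and the code is an illustration rather than the module's source. In Python 3 the standard library's html.unescape covers the same need.

```python
import re
from html.entities import name2codepoint


def html2unicode_sketch(text):
    def fixup(match):
        ref = match.group(0)
        if ref.startswith("&#"):
            # Numeric character reference, decimal or hexadecimal.
            try:
                if ref[2:3] in ("x", "X"):
                    return chr(int(ref[3:-1], 16))
                return chr(int(ref[2:-1]))
            except (ValueError, OverflowError):
                return ref  # malformed: leave as is
        # Named entity such as &amp; or &lt;.
        name = ref[1:-1]
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return ref  # unknown entity: leave as is

    return re.sub(r"&#?\w+;", fixup, text)


print(html2unicode_sketch("&lt;b&gt; &#233; &bogus;"))  # <b> é &bogus;
```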

nat_lang_join(xs, last_glue, two_glue=None, glue=', ')

Natural-language join. Joins a sequence of strings into a comma-separated list in which the last pair is joined with the given special glue. (You may also override the non-last glue, which defaults to ', '.)

Parameters:

xs - The sequence of strings. This must be a list-like sequence, not a generated one.
last_glue - The string used to join the final pair of elements, when there are more than two elements.
two_glue - The string used to join both elements in a 2-element sequence. Defaults to None, which means to use last_glue.
glue - The string used to join all the other elements.
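
The parameters above interact as in the following Python 3 sketch. It is an illustration, not the module's source: `nat_lang_join_sketch` is a hypothetical name, the handling of empty and one-element sequences is an assumption, and the glue strings in the usage line are examples only (the values that and_join and or_join actually pass are not documented here).

```python
def nat_lang_join_sketch(xs, last_glue, two_glue=None, glue=", "):
    if two_glue is None:
        # Documented default: fall back to last_glue for pairs.
        two_glue = last_glue
    if len(xs) == 0:
        return ""  # assumption: empty input gives an empty string
    if len(xs) == 1:
        return xs[0]  # assumption: a single element is returned as is
    if len(xs) == 2:
        return two_glue.join(xs)
    # Slicing and len() are why xs must be list-like, not a generator.
    return glue.join(xs[:-1]) + last_glue + xs[-1]


print(nat_lang_join_sketch(["a", "b", "c"], ", and ", " and "))  # a, b, and c
```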

Variables Details

cp1252_to_unicode_translations

Value:

[(u'\x80', u'\u20ac'),
 (u'\x82', u'\u201a'),
 (u'\x83', u'\u0192'),
 (u'\x84', u'\u201e'),
 (u'\x85', u'\u2026'),
 (u'\x86', u'\u2020'),
 (u'\x87', u'\u2021'),
 (u'\x88', u'\u02c6'),
...
