
Module strs

String formatting, encoding, etc.

Classes
  str_test
Functions

format(*args)
Formats the args as the print built-in would.

safe_ascii(s)
Casts a Unicode string to a regular ASCII string.

cp1252_to_unicode(x)
Converts characters 0x80 through 0x9f to their proper Unicode equivalents.

unwrap(s)
Joins a bunch of lines.

indent(s, ind=' ')
Prefixes each line in s with ind.

unindent(text, amt=None)
Removes leading whitespace from each line in text; amt controls how much.

remove_empty_lines(s)
Removes all empty lines (or lines of just whitespace).

underline(s, sep)
Appends to s a newline followed by sep repeated len(s) times.

dos2unix(s)
Removes carriage returns.

quotejs(s)
Escapes a string as a JavaScript Unicode string literal.

unicode2html(s)
Extends cgi.escape() with escapes for all Unicode characters.

html2unicode(text)
A rough counterpart to cgi.escape(); the cgi module has no unescape function.

nat_lang_join(xs, last_glue, two_glue=None, glue=', ')
Natural-language join.

or_join(xs)

and_join(xs)
Variables
  cp1252_to_unicode_translations = [(u'\x80', u'\u20ac'), (u'\x82', u'\u201a'), ...
  unicode_special = re.compile(r'[\x80-\uffff]')
  __package__ = 'commons'
Function Details

safe_ascii(s)

Casts a Unicode string to a regular ASCII string. This may be lossy.

cp1252_to_unicode(x)

Converts characters 0x80 through 0x9f to their proper Unicode equivalents. See http://www.intertwingly.net/stories/2004/04/14/i18n.html for the nice translation table on which this is based.
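
The conversion described above can be sketched in Python 3 as follows. This is not the module's source: instead of a hand-written translation table it assumes that Python's built-in cp1252 codec yields the same mapping, and the name cp1252_to_unicode_sketch is hypothetical.

```python
def cp1252_to_unicode_sketch(text):
    """Replace characters U+0080..U+009F with the characters that the
    same byte values represent in Windows-1252 (assumed mapping)."""
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                # Reinterpret the code point as a single cp1252 byte.
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values (e.g. 0x81) are undefined in cp1252;
                # this sketch leaves them unchanged.
                pass
        out.append(ch)
    return "".join(out)


# Mis-decoded "smart quotes" become real quotation marks.
fixed = cp1252_to_unicode_sketch("\x93quoted\x94")
```

Characters outside the 0x80 through 0x9f range pass through untouched, so the function is safe to apply to text that is already correct.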

unwrap(s)

Joins a bunch of lines. s is either a single string (which is split on newlines into a list of strings) or a list of strings (each representing a line).

indent(s, ind=' ')

Prefixes each line in s with ind. s can be either a string (which will be broken up into a list of lines) or a list of strings (treated as lines). Returns a single (indented) string.
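
A minimal Python 3 sketch of this behavior, not the module's source. It assumes that a string argument is split on '\n' and that the result is joined with '\n'; the name indent_sketch is hypothetical.

```python
def indent_sketch(s, ind=" "):
    """Prefix every line with `ind` and return one string."""
    # Accept either a single string or a list of lines.
    lines = s.split("\n") if isinstance(s, str) else list(s)
    return "\n".join(ind + line for line in lines)


from_string = indent_sketch("first\nsecond", ind="    ")
from_list = indent_sketch(["first", "second"])
```

Both calls return a single string, matching the documented return type regardless of the input form.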

unindent(text, amt=None)

If amt is 0, removes all leading whitespace from each line in text. If amt is None, finds the smallest amount of leading whitespace on any non-empty line and removes that many chars from each line. If amt is positive, removes amt chars from each line.
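
The three cases can be sketched in Python 3 as follows. This is an illustration of the documented behavior rather than the module's source; it assumes lines are separated by '\n', and the name unindent_sketch is hypothetical.

```python
def unindent_sketch(text, amt=None):
    """Strip leading whitespace from each line according to `amt`."""
    lines = text.split("\n")
    if amt == 0:
        # Case 1: remove all leading whitespace.
        return "\n".join(line.lstrip() for line in lines)
    if amt is None:
        # Case 2: use the smallest indentation of any non-empty line.
        widths = [len(line) - len(line.lstrip())
                  for line in lines if line.strip()]
        amt = min(widths) if widths else 0
    # Cases 2 and 3: drop a fixed number of characters from each line.
    return "\n".join(line[amt:] for line in lines)


sample = "  a\n    b"
common_removed = unindent_sketch(sample)
all_removed = unindent_sketch(sample, 0)
one_removed = unindent_sketch(sample, 1)
```

Note that a positive amt removes characters blindly, so a line with less indentation than amt loses non-whitespace characters in this sketch.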

html2unicode(text)

A rough counterpart to cgi.escape() (the cgi module provides no unescape function). Replaces HTML or XML character references and entities in a text string with the corresponding Unicode characters.

http://effbot.org/zone/re-sub.htm#unescape-html
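
A Python 3 sketch of the regular-expression approach, given as an illustration and not as the module's source. It assumes that unknown or malformed references are left unchanged; the name html2unicode_sketch is hypothetical.

```python
import re
from html.entities import name2codepoint


def html2unicode_sketch(text):
    """Replace character references and named entities with characters."""
    def fixup(match):
        ref = match.group(0)
        try:
            if ref.startswith("&#x"):
                return chr(int(ref[3:-1], 16))    # hexadecimal reference
            if ref.startswith("&#"):
                return chr(int(ref[2:-1]))        # decimal reference
            return chr(name2codepoint[ref[1:-1]])  # named entity
        except (ValueError, KeyError, OverflowError):
            return ref  # not a valid reference: leave it as-is

    return re.sub(r"&#?\w+;", fixup, text)


decoded = html2unicode_sketch("&lt;b&gt; caf&eacute; &#169; &#x20ac; &bogus;")
```

In Python 3 the standard library's html.unescape() covers the same ground and is usually the simpler choice.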

nat_lang_join(xs, last_glue, two_glue=None, glue=', ')

Natural-language join. Joins a sequence of strings into a comma-separated list in which the last pair is joined with the given special glue. (You may also override the non-last glue, which defaults to ', '.)

Parameters:
  • xs - The sequence of strings. This must be a list-like sequence, not a generated one.
  • last_glue - The string used to join the final pair of elements, when there are more than two elements.
  • two_glue - The string used to join both elements in a 2-element sequence. Defaults to None, which means to use last_glue.
  • glue - The string used to join all the other elements.
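
The parameters above can be illustrated with a Python 3 sketch. This is not the module's source: the handling of empty and one-element sequences is an assumption, as are the glue strings in the usage lines, and the name nat_lang_join_sketch is hypothetical.

```python
def nat_lang_join_sketch(xs, last_glue, two_glue=None, glue=", "):
    """Join strings with `glue`, using a special glue before the last one."""
    if two_glue is None:
        two_glue = last_glue          # documented default
    if len(xs) == 0:
        return ""                     # assumption: empty input gives ""
    if len(xs) == 1:
        return xs[0]                  # assumption: single item unchanged
    if len(xs) == 2:
        return xs[0] + two_glue + xs[1]
    # More than two items: ordinary glue, then last_glue before the final item.
    return glue.join(xs[:-1]) + last_glue + xs[-1]


three = nat_lang_join_sketch(["a", "b", "c"], ", and ", " and ")
two = nat_lang_join_sketch(["a", "b"], ", and ", " and ")
fallback = nat_lang_join_sketch(["a", "b"], " or ")
```

The len() calls and slicing are why xs must be a list-like sequence rather than a generator.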

Variables Details

cp1252_to_unicode_translations

Value:
[(u'\x80', u'\u20ac'),
 (u'\x82', u'\u201a'),
 (u'\x83', u'\u0192'),
 (u'\x84', u'\u201e'),
 (u'\x85', u'\u2026'),
 (u'\x86', u'\u2020'),
 (u'\x87', u'\u2021'),
 (u'\x88', u'\u02c6'),
...