Welcome. Today's question: Which 5 escape sequences should Python
beginners learn first?
I'm Paul, and I help programmers add structure to their learning process,
because it can all be so overwhelming.
So here we'll see how escape sequences help us display text and how to
find them yourself. And for structure, we'll view the whole list of 16
and whittle that down to the 5 beginners should learn first, leaving
the rest for later.
In the next video (tutorial) of our data science journey, we'll cover
common math functions.
(Commands in Linux)
python3
less
(Escape sequences and functions in Python)
\\
\'
\"
\t
\n
help()
print()
Step 1 - What Do Escape Sequences Do?
Let's head to the Linux Terminal.
paul@fullstack:~$ less notes/python_sequences.txt
Our list of escape sequences for Python 3.4.2
What I meant by adding structure is creating tables and checklists to
not only scope out what we don't know, but to monitor our progress.
Python 3.4.2 Escape Sequences
Escape Sequences allow us to print characters that have other
meanings to Python.
------------------------------------------------------
| Sequence     | Effect                              |
------------------------------------------------------
| \newline     | backslash and newline ignored       |
------------------------------------------------------
| \\ (31)      | second backslash printed            |
------------------------------------------------------
| \' (31)      | single-quote printed                |
------------------------------------------------------
| \" (31)      | double-quote printed                |
------------------------------------------------------
| \a           | bell or system sound                |
------------------------------------------------------
| \b           | backspace                           |
------------------------------------------------------
| \f           | formfeed                            |
------------------------------------------------------
| \n (31)      | linefeed                            |
------------------------------------------------------
| \r           | carriage return                     |
------------------------------------------------------
| \t (31)      | horizontal tab                      |
------------------------------------------------------
| \v           | vertical tab                        |
------------------------------------------------------
| \ooo         | character with octal value          |
------------------------------------------------------
| \xhh         | character with hex value            |
------------------------------------------------------
| \N{name}     | character named in Unicode          |
------------------------------------------------------
| \uxxxx       | character with 16-bit hex value     |
------------------------------------------------------
| \Uxxxxxxxx   | character with 32-bit hex value     |
------------------------------------------------------
notes/python_sequences.txt
Here's a table with 16 escape sequences in Python 3.4.2.
What are escape sequences in Python for?
In a nutshell, escape sequences allow us to print characters that have
other meanings to Python, or Linux for that matter. And here I note
the video (tutorial) numbers where I introduce something new.
As mentioned in the last video (tutorial), systems originally used
ASCII to organize 128 characters: the lowercase and capital letters,
numbers and non-alphanumeric symbols found on an English language keyboard.
paul@fullstack:~$ python3
Python 3.4.2 (default, Oct 8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a
Then there are others we don't normally think about, like the Null,
carriage return, escape, space and delete characters.
Think about it. We know what the character
a looks like, but what does
backspace look like? We know what it does, but what does it look like
to the computer, right?
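One way to peek at what the computer sees is with the built-in ord(), repr() and len() functions. This is only a small sketch, and the variable names are made up for illustration:

```python
# Every character is stored as a number (its code point); ord() reveals it.
visible = "a"
backspace = "\b"   # the backspace escape sequence from the table above

print(ord(visible))      # 97
print(ord(backspace))    # 8
print(len(backspace))    # 1 (backslash-b is a single character)
print(repr(backspace))   # '\x08' (repr shows non-printing characters as escapes)
```

So to the computer, backspace is simply character number 8, even though it has no shape on screen.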
Later, Unicode added characters for other languages, because 128 was
obviously not enough. Links are provided to those
articles (on Wikipedia).
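As a quick illustration of why 128 is not enough, here is a sketch that checks whether a character's number fits in the ASCII range. The helper name fits_in_ascii is hypothetical, and the sample characters are just examples:

```python
def fits_in_ascii(ch):
    """Return True if the character's code point is below 128."""
    return ord(ch) < 128

ascii_letter = "a"
accented = "\u00e9"   # e with an acute accent, written as a Unicode escape
euro = "\u20ac"       # the euro sign

for ch in (ascii_letter, accented, euro):
    # Show the character, its number, and whether ASCII could hold it.
    print(repr(ch), ord(ch), fits_in_ascii(ch))
```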
Step 2 - Find Escape Sequences in Python for Your Version
Access local Python help() documentation
Let's find escape sequences in the local Python documentation so we
don't get sidetracked on the web.
From within Python, open interactive help like this.
>>> help()
Welcome to Python 3.4's help utility!
If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/3.4/tutorial/.
Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules. To quit this help utility and
return to the interpreter, just type "quit".
To get a list of available modules, keywords, symbols, or topics, type
"modules", "keywords", "symbols", or "topics". Each module also comes
with a one-line summary of what it does; to list the modules whose name
or summary contain a given string such as "spam", type "modules spam".
help>
Escape sequences use the \ backslash
character, so let's try it: type \ and press Enter.
(You could also use
help> 'STRINGS').
help> \
Step 3 - View Full List of Escape Sequences
That opens a help page on string and bytes literals. Press
j to go down,
k to go up.
String and Bytes literals
*************************
String literals are described by the following lexical definitions:
stringliteral ::= [stringprefix](shortstring | longstring)
stringprefix ::= "r" | "u" | "R" | "U"
shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem ::= shortstringchar | stringescapeseq
longstringitem ::= longstringchar | stringescapeseq
shortstringchar ::= <any source character except "\" or newline or the quote>
longstringchar ::= <any source character except "\">
stringescapeseq ::= "\" <any source character>
bytesliteral ::= bytesprefix(shortbytes | longbytes)
bytesprefix ::= "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB"
shortbytes ::= "'" shortbytesitem* "'" | '"' shortbytesitem* '"'
longbytes ::= "'''" longbytesitem* "'''" | '"""' longbytesitem* '"""'
shortbytesitem ::= shortbyteschar | bytesescapeseq
longbytesitem ::= longbyteschar | bytesescapeseq
shortbyteschar ::= <any ASCII character except "\" or newline or the quote>
longbyteschar ::= <any ASCII character except "\">
bytesescapeseq ::= "\" <any ASCII character>
One syntactic restriction not indicated by these productions is that
whitespace is not allowed between the "stringprefix" or "bytesprefix"
and the rest of the literal. The source character set is defined by
the encoding declaration; it is UTF-8 if no encoding declaration is
given in the source file; see section *Encoding declarations*.
In plain English: Both types of literals can be enclosed in matching
single quotes ("'") or double quotes ("""). They can also be enclosed
in matching groups of three single or double quotes (these are
generally referred to as *triple-quoted strings*). The backslash
("\") character is used to escape characters that otherwise have a
special meaning, such as newline, backslash itself, or the quote
character.
Bytes literals are always prefixed with "'b'" or "'B'"; they produce
an instance of the "bytes" type instead of the "str" type. They may
only contain ASCII characters; bytes with a numeric value of 128 or
greater must be expressed with escapes.
As of Python 3.3 it is possible again to prefix unicode strings with a
"u" prefix to simplify maintenance of dual 2.x and 3.x codebases.
Both string and bytes literals may optionally be prefixed with a
letter "'r'" or "'R'"; such strings are called *raw strings* and treat
backslashes as literal characters. As a result, in string literals,
"'\U'" and "'\u'" escapes in raw strings are not treated specially.
Given that Python 2.x's raw unicode literals behave differently than
Python 3.x's the "'ur'" syntax is not supported.
New in version 3.3: The "'rb'" prefix of raw bytes literals has
been added as a synonym of "'br'".
New in version 3.3: Support for the unicode legacy literal
("u'value'") was reintroduced to simplify the maintenance of dual
Python 2.x and 3.x codebases. See **PEP 414** for more information.
In triple-quoted strings, unescaped newlines and quotes are allowed
(and are retained), except that three unescaped quotes in a row
terminate the string. (A "quote" is the character used to open the
string, i.e. either "'" or """.)
Unless an "'r'" or "'R'" prefix is present, escape sequences in
strings are interpreted according to rules similar to those used by
Standard C. The recognized escape sequences are:
+-------------------+-----------------------------------+---------+
| Escape Sequence   | Meaning                           | Notes   |
+===================+===================================+=========+
| "\newline"        | Backslash and newline ignored     |         |
+-------------------+-----------------------------------+---------+
| "\\"              | Backslash ("\")                   |         |
+-------------------+-----------------------------------+---------+
| "\'"              | Single quote ("'")                |         |
+-------------------+-----------------------------------+---------+
| "\""              | Double quote (""")                |         |
+-------------------+-----------------------------------+---------+
| "\a"              | ASCII Bell (BEL)                  |         |
+-------------------+-----------------------------------+---------+
| "\b"              | ASCII Backspace (BS)              |         |
+-------------------+-----------------------------------+---------+
| "\f"              | ASCII Formfeed (FF)               |         |
+-------------------+-----------------------------------+---------+
| "\n"              | ASCII Linefeed (LF)               |         |
+-------------------+-----------------------------------+---------+
| "\r"              | ASCII Carriage Return (CR)        |         |
+-------------------+-----------------------------------+---------+
| "\t"              | ASCII Horizontal Tab (TAB)        |         |
+-------------------+-----------------------------------+---------+
| "\v"              | ASCII Vertical Tab (VT)           |         |
+-------------------+-----------------------------------+---------+
| "\ooo"            | Character with octal value *ooo*  | (1,3)   |
+-------------------+-----------------------------------+---------+
| "\xhh"            | Character with hex value *hh*     | (2,3)   |
+-------------------+-----------------------------------+---------+
Escape sequences only recognized in string literals are:
+-------------------+-----------------------------------+---------+
| Escape Sequence   | Meaning                           | Notes   |
+===================+===================================+=========+
| "\N{name}"        | Character named *name* in the     | (4)     |
|                   | Unicode database                  |         |
+-------------------+-----------------------------------+---------+
| "\uxxxx"          | Character with 16-bit hex value   | (5)     |
|                   | *xxxx*                            |         |
+-------------------+-----------------------------------+---------+
| "\Uxxxxxxxx"      | Character with 32-bit hex value   | (6)     |
|                   | *xxxxxxxx*                        |         |
+-------------------+-----------------------------------+---------+
Notes:
1. As in Standard C, up to three octal digits are accepted.
2. Unlike in Standard C, exactly two hex digits are required.
3. In a bytes literal, hexadecimal and octal escapes denote the
byte with the given value. In a string literal, these escapes
denote a Unicode character with the given value.
4. Changed in version 3.3: Support for name aliases [1] has been
added.
5. Individual code units which form parts of a surrogate pair can
be encoded using this escape sequence. Exactly four hex digits are
required.
6. Any Unicode character can be encoded this way. Exactly eight
hex digits are required.
Unlike Standard C, all unrecognized escape sequences are left in the
string unchanged, i.e., *the backslash is left in the string*. (This
behavior is useful when debugging: if an escape sequence is mistyped,
the resulting output is more easily recognized as broken.) It is also
important to note that the escape sequences only recognized in string
literals fall into the category of unrecognized escapes for bytes
literals.
Even in a raw string, string quotes can be escaped with a backslash,
but the backslash remains in the string; for example, "r"\""" is a
valid string literal consisting of two characters: a backslash and a
double quote; "r"\"" is not a valid string literal (even a raw string
cannot end in an odd number of backslashes). Specifically, *a raw
string cannot end in a single backslash* (since the backslash would
escape the following quote character). Note also that a single
backslash followed by a newline is interpreted as those two characters
as part of the string, *not* as a line continuation.
Related help topics: str, UNICODE, SEQUENCES, STRINGMETHODS, FORMATTING,TYPES
Here are 13, and another 3 below that. See if your version matches mine.
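We're leaving most of these for later, but as a quick sketch of how the numeric and Unicode sequences in the tables relate, here are five different ways to spell the same capital letter A. The variable names are only for illustration:

```python
letter_a = "A"

octal = "\101"                         # octal 101 is decimal 65
hexadecimal = "\x41"                   # hex 41 is decimal 65
unicode_16 = "\u0041"                  # exactly four hex digits
unicode_32 = "\U00000041"              # exactly eight hex digits
named = "\N{LATIN CAPITAL LETTER A}"   # looked up by its Unicode name

spellings = (octal, hexadecimal, unicode_16, unicode_32, named)
all_same = all(s == letter_a for s in spellings)
print(all_same)   # True
```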
help> quit
You are now leaving help and returning to the Python interpreter.
If you want to ask for help on a particular object directly from the
interpreter, you can type "help(object)". Executing "help('string')"
has the same effect as typing a particular string at the help> prompt.
>>>
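If you'd rather search that help page from a script, here is one possible sketch. It assumes your Python install ships the pydoc_data.topics module and that the page is stored under the key "strings"; both are assumptions about your setup, so the helper (a hypothetical name) falls back to an empty string when either is missing:

```python
def strings_help_text():
    """Return the text of the STRINGS help topic, or '' if unavailable."""
    try:
        # Assumption: pydoc_data is installed alongside the interpreter.
        from pydoc_data import topics
    except ImportError:
        return ""
    # Assumption: the topic is stored under the lowercase key "strings".
    found = topics.topics.get("strings", "")
    return found if isinstance(found, str) else ""

text = strings_help_text()

# Collect the lines of the help page that mention escapes.
escape_lines = [line for line in text.splitlines() if "escape" in line.lower()]
print(len(escape_lines))
```

The count will vary between Python versions, which is one more reason to check your own version rather than rely on mine.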
Step 4 - Practice with \\, \', \", \t, \n
The Python double backslash escape sequence \\
First, double backslash (\\).
The backslash character identifies that a special instruction is coming
when used in a block of text.
Let's try one. Using the print function, let's
"Print text with a \".
>>> print("Print text with a \")
File "<stdin>", line 1
print("Print text with a \")
^
SyntaxError: EOL while scanning string literal
>>>
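Why didn't that work? We seemed to follow the rules: we called the function
print(), surrounded the text with
double-quotes and closed it with a closing parenthesis. What went wrong?
The backslash character has a special meaning. Here, \" told Python to
treat the closing double-quote as part of the text, so the string never
ended. To print one backslash, we need two backslashes.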
>>> print("Print text with a \\")
Print text with a \
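To convince ourselves that the two backslashes we type become one character, here is a small sketch. The path below is a made-up example, not a file from this tutorial:

```python
one_backslash = "\\"          # typed as two characters in the source code
print(len(one_backslash))     # 1

sentence = "Print text with a \\"
print(sentence)               # Print text with a \

# Hypothetical Windows-style path: every backslash is doubled in the source.
path = "C:\\notes\\python_sequences.txt"
print(path)                   # C:\notes\python_sequences.txt
```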
The Python single-quote escape sequence \'
Next, what if you wanted to type a single-quote in a text block defined
with single-quotes?
See how it doesn't work this way.
>>> print('How's your day?')
File "<stdin>", line 1
print('How's your day?')
^
SyntaxError: invalid syntax
>>>
To Python, the second single-quote closed the text string. So in this
case, we'd need to include the escape sequence.
>>> print('How\'s your day?')
How's your day?
>>>
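The escaped version and the double-quoted version produce exactly the same text, which this short sketch checks (variable names are illustrative):

```python
escaped = 'How\'s your day?'      # single-quoted, apostrophe escaped
alternate = "How's your day?"     # double-quoted, no escape needed

print(escaped)                    # How's your day?
print(escaped == alternate)       # True
print(len(escaped))               # 15 (the backslash is not stored)
```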
The Python double-quote escape sequence \"
Third, the backslash-double-quote. This is very similar.
Remember, Python gives us two ways to identify a block of text.
We can surround it with double-quotes if, for example, we want to
include an apostrophe inside, avoiding the need for an escape sequence
altogether.
>>> print("He's funny.")
He's funny.
>>>
See how "He's funny" works?
Now what if we wanted to say he's "funny" in quotes?
Without escape sequences it won't work, because the double-quote before
funny closes the string early and the rest of the line confuses Python,
so it gives an error.
We can correct it by adding a backslash before each of the two
double-quotes around the word funny.
>>> print("He's \"funny\".")
He's "funny".
>>>
Good.
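To see the failing version without breaking our own script, one option is to hand the broken line to the built-in compile() function and catch the error. This is just a sketch; the names broken, fixed and error_seen are made up for illustration:

```python
# The broken line, stored as text so that this script itself stays valid.
broken = 'print("He\'s "funny".")'

error_seen = False
try:
    compile(broken, "<example>", "exec")
except SyntaxError:
    # Python could not make sense of the unescaped inner double-quotes.
    error_seen = True

fixed = "He's \"funny\"."
print(error_seen)   # True
print(fixed)        # He's "funny".
```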
The Python tab escape sequence \t
Fourth, let's add a tab to a text block by adding a backslash-t
(\t).
>>> print("The \\t adds a \ttab")
The \t adds a   tab
See how I escaped the first backslash so it printed
\t literally, while the second \t was treated as a tab
request.
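Tabs are handy for lining up simple columns. Here is a small sketch with made-up rows; it also uses the string method expandtabs() to show how a tab turns into spaces:

```python
# Hypothetical rows: the doubled backslashes print as literal \t and \n.
rows = [("Name", "Effect"), ("\\t", "tab"), ("\\n", "newline")]

for left, right in rows:
    # A real tab character separates the two columns.
    print(left + "\t" + right)

print(len("\t"))                  # 1 (a tab is a single character)

expanded = "a\tb".expandtabs(4)   # replace the tab with spaces, tab stops every 4 columns
print(expanded)                   # a   b
```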
The Python new line escape sequence \n (aka, newline)
Fifth, let's add a newline character, which simply moves to the next
line, using the backslash-n (\n)
combination.
>>> print("The \\n goes to a \nnew line")
The \n goes to a
new line
>>> exit()
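The same message can be inspected in a script. This sketch uses the string methods count() and splitlines() to show that only the unescaped \n became a real line break:

```python
message = "The \\n goes to a \nnew line"
print(message)

# Only one real newline character is stored; the escaped one is plain text.
newline_count = message.count("\n")
print(newline_count)        # 1

lines = message.splitlines()
print(len(lines))           # 2
print(lines[1])             # new line
```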
So you'd be surprised. These five escape sequences will get you pretty
far, and the rest of our list we can put aside for now.
Step 5 - Next: Basic Math Functions
For those sticking around, please subscribe and join us on our
journey.
Client : HTML, CSS, JavaScript
Software : Python Scientific Stack
Data : PostgreSQL, MySQL / MariaDB
OS : Linux (command line), Debian
This was video (tutorial) #31, and we're only a couple videos away from
kicking off Project 4, where we add a new piece of software to our
stack. How exciting!
Have a nice day.