Python (programming language)
Revision as of 21:37, 28 June 2009
Python is a dynamic, object-oriented, general-purpose, interpreted programming language that runs on many different computer platforms and mobile devices. Python is open-source software published under an OSI-approved license. It aims to be efficient, coherent, readable, and fun to use. Because Python is interpreted, programs run immediately without the need for lengthy compile and link steps.
History
Python was first published by the Dutch computer scientist (and applied mathematician) Guido van Rossum in pre-release (early-adopter) form in 1991, and to this day he remains the project leader and final arbiter of Python Enhancement Proposals (PEPs).
Python 2.5.1 is the current production release, and is very stable.
Python's major (standard) releases were:
- Python 2.5.1 (April 2007)
- Python 2.4.4 (October 2006)
- Python 2.3.6 (November 2006)
- Python 2.2.3 (May 2003)
- Python 2.1.3 (April 2002)
- Python 2.0.1 (June 2001)
- Python 1.6.1 (September 2000)
- Python 1.5.2 (April 1999)
- Python 1.4 (October 1996)
- Python 1.3 (October 1995)
- Python 1.2 (April 1995)
- Python 1.1 (1995)
- Python 1.0 (1994)
A full refactoring of Python (version 3.0), known as Python 3000 (Py3k), is currently under development and is in its early alpha (pre-production) stage.
Examples
Hello World
The code for the "hello world" program could hardly be simpler:
print 'Hello World'
This can be put in a file "hello.py", for example, and executed with
python hello.py
from your operating system's command line. Alternatively, the code can be typed directly into an interactive Python environment (the Python command line interpreter or IDLE, both of which are included in the standard Python distribution).
Calculator
The Python interpreter can be invoked from the command line and used as a scientific calculator. At the prompt (denoted here by >>>), type
>>> 2+3*(1+1)
and the interpreter will print 8. Division of integers returns an integer result, so
>>> 7/2
is 3 (the floor of the exact result). If a floating-point result is needed, at least one operand should be a floating-point number, as shown below:
>>> 7.0/2
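The integer-division behaviour described above is specific to Python 2. As a hedged sketch, the following lines are written in Python 3 syntax (where print is a function) and use the explicit floor-division operator `//`, so the results do not depend on how `/` treats integers. The variable names are illustrative only.

```python
# Floor division: the exact result is rounded down to an integer.
floor_result = 7 // 2

# Division with a float operand yields a float result.
real_result = 7.0 / 2

# float() converts an integer operand explicitly before dividing.
converted_result = float(7) / 2

print(floor_result, real_result, converted_result)
```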
More interesting functions can be found in the math module. It must be imported before it can be used.
>>> from math import *
>>> print sin(pi/2)
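The wildcard import above places every name from the math module into the current namespace. A minimal sketch of the alternative, importing the module itself and using qualified names, is given here in Python 3 syntax. The variable names are hypothetical.

```python
import math

# Qualified names make it clear where sin, pi and sqrt come from.
sine_value = math.sin(math.pi / 2)
root_value = math.sqrt(16)

print(sine_value, root_value)
```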
High-quality graphs may be obtained with the matplotlib library.[1]
Files
File handling is one of Python's convenient features. Line-by-line, Perl-like file processing can be done with a standard for loop, as in the following simple word count script. Comments start with the hash sign '#'.
(char_count, word_count, line_count) = 0, 0, 0   # multiple assignment is available
file = open('myfile.txt')                        # standard opening of the file
for line in file:                                # this is the idiomatic use of the "for" loop
    wordlist = line.split()                      # splitting the line into a list of words
    word_count += len(wordlist)                  # counting words
    line_count += 1                              # counting lines
    char_count += len(line) - 1                  # counting characters (without the newline)
print line_count, word_count, char_count         # we're done
The indentation indicates which code is executed within the loop. This is part of Python's syntax. When the end of the file is reached, the loop terminates and the results are printed by the last line of the code. Note also that the variable line contains the terminating newline character, so 1 is subtracted from its length when counting characters. Another observation is worth making: the method (i.e. function) that splits a string is provided by the string itself. Indeed, Python is an object-oriented language; string variables are objects, and the related methods are their attributes.
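The same counting logic can be wrapped in a reusable function. The sketch below is written for Python 3 and assumes a plain-text file. The function name count_file and the use of the with statement (which closes the file automatically) are choices made for this illustration, not part of the script above. It also strips the newline only when one is present, so a final line without a newline is counted in full.

```python
def count_file(path):
    """Return (line_count, word_count, char_count) for a text file."""
    line_count = word_count = char_count = 0
    with open(path) as handle:                    # closed automatically on exit
        for line in handle:                       # iterate line by line
            line_count += 1
            word_count += len(line.split())       # split() is a string method
            char_count += len(line.rstrip('\n'))  # ignore the trailing newline
    return line_count, word_count, char_count
```

Calling count_file('myfile.txt') would return the three counts as a tuple.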
Internet access
The following script counts the images on the Citizendium Main Page.
1 import urllib2
2 cnt=0
3 for line in urllib2.urlopen('http://en.citizendium.org/wiki/Main_Page'):
4     cnt += line.count('<img src')
5 print cnt
(Note that the line numbers were added for clarity; they are not part of the code and must be left out of an actual Python script.)
In line 1, we import a module from the standard library. Python's standard library is rich and is often regarded as one of Python's strong points. In fact, Python's official pages declare a "batteries included" philosophy.
In line 2, the variable named 'cnt' is initialized to zero. This is required, because line 4 adds to the variable's current value.
In line 3, the URL is fetched using the urlopen function of the urllib2 module. This merits some discussion. In Python, the source HTML of a web page can be treated much like a local file, e.g. processed line by line with a for loop. In the above example the variable line is a string containing a piece of the HTML code of Citizendium's Main Page. Embedded images are inserted in this code with text that begins with '<img src', so it is enough to count instances of this string.
In line 4 (inside the for loop), the string method count() returns the number of times "<img src" occurs in the current line, and the cnt variable is increased by that number.
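The counting step can be separated from the network access so that it can be tried on any sequence of strings. The following is a sketch in Python 3, where the functionality of urllib2 lives in urllib.request and urlopen yields bytes rather than strings. The helper names count_img_tags and fetch_lines, and the assumption that the page is UTF-8 encoded, are introduced here for illustration only.

```python
from urllib.request import urlopen


def count_img_tags(lines, marker='<img src'):
    """Sum the occurrences of marker over an iterable of strings."""
    total = 0
    for line in lines:
        total += line.count(marker)   # count() may return more than 1
    return total


def fetch_lines(url):
    """Yield the decoded lines of the page at url (assumes UTF-8)."""
    with urlopen(url) as response:
        for raw_line in response:
            yield raw_line.decode('utf-8', errors='replace')


# Example on literal data; no network access is needed here.
sample = ['<p><img src="a.png"><img src="b.png"></p>', '<p>no images</p>']
print(count_img_tags(sample))
```

Passing fetch_lines('http://en.citizendium.org/wiki/Main_Page') to count_img_tags would mirror the script above, provided the URL is still reachable.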
Syntax
Notable features of the Python syntax include high readability of the code, which is aided by the use of indentation to separate blocks of code and a general "one statement per line" principle.[2]
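As an illustration of these points, the sketch below (Python 3 syntax; the function classify is invented for this example) uses indentation to delimit blocks, a backslash to continue a long statement, and a semicolon to place two statements on one line.

```python
def classify(number):
    # The indented lines form the body of the function.
    if number < 0:
        label = 'negative'      # body of the if branch
    else:
        label = 'non-negative'  # body of the else branch
    return label


# A backslash at the end of a line continues the statement.
total = 1 + 2 + 3 + \
        4 + 5

# Two statements on one line, separated by a semicolon (legal but rare).
first = classify(-1); second = classify(total)

print(total, first, second)
```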
Implementations
Python's official distribution is known as CPython. It's written in C and functions as a virtual machine for interpreting bytecode-compiled Python programs. Jython is an implementation for the Java Virtual Machine, which can run either standalone (like CPython) or as an embedded scripting engine. IronPython is an implementation for the Common Language Runtime (.NET and Mono). PyPy is an implementation written in Python that targets several backends, including C, LLVM, JavaScript, JVM and CLR.
Python IDEs
Several integrated development environments (IDEs) support Python as a programming language.
Popular Python IDEs include the following:
- IDLE, the default Integrated Development Environment for Python
- Eclipse PyDev, the Python plug-in for the Eclipse IDE
- ActiveState Komodo, an IDE for multiple scripting languages
- Wingware Wing, a multiplatform, Python-specific IDE
Notes and references
- ↑ This is installed separately; see its introduction and a couple of examples.
- ↑ A backslash "\" at the end of a line allows a long statement, e.g. an assignment, to be broken over multiple lines. It is also formally possible to put more than one statement on a line by separating them with semicolons. Still, the general principle shapes the code.