Beautiful Soup
Details
| Size: | 69K |
| Last Update: | 2008-05-02 00:13:46 |
| Version: | 3.0.3 |
| OS Support: | Linux |
| License/Program Type: | Python License |
| Publisher: | Leonard Richardson |
| Price: | $0.00 |
Description:
Beautiful Soup 3.0.3 is markup software developed by Leonard Richardson.
Beautiful Soup is a Python HTML/XML parser designed for quick-turnaround projects such as screen scraping. Three features make it powerful:
Beautiful Soup won't choke if you give it bad markup. It yields a parse tree that makes approximately as much sense as your original document. This is usually good enough to collect the data you need and run away.
Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. You don't have to create a custom parser for each application.
Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don't have to think about encodings, unless the document doesn't specify an encoding and Beautiful Soup can't autodetect one. Then you just have to specify the original encoding.
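The encoding behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the 3.0.3 API: it uses the later `bs4` package (Beautiful Soup 4) with Python's built-in `html.parser`, and the sample document is invented. In the 3.x series the import was `from BeautifulSoup import BeautifulSoup` and the keyword was spelled `fromEncoding`.

```python
from bs4 import BeautifulSoup

# A made-up document stored as Latin-1 bytes, with no encoding declared.
raw = "<p>café</p>".encode("latin-1")

# Assumption: we know the original encoding and pass it explicitly,
# which is the fallback the text describes when autodetection fails.
soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")

# Inside the parse tree the text is ordinary Unicode.
text = soup.p.string

# On the way out, the document can be serialized as UTF-8 bytes.
utf8_bytes = soup.encode("utf-8")

print(text)
print(utf8_bytes)
```

If the document declares its own encoding, or the library can detect it, the `from_encoding` argument can usually be left out.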
Beautiful Soup parses anything you give it, and does the tree traversal stuff for you. You can tell it "Find all the links", or "Find all the links of class externalLink", or "Find all the links whose URLs match 'foo.com'", or "Find the table heading that's got bold text, then give me that text."
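The four requests quoted above might look something like this in code. It is a hedged sketch, assuming the later `bs4` package rather than the 3.0.3 release listed here (which used `findAll` and `from BeautifulSoup import BeautifulSoup`). The sample HTML and variable names are invented for illustration.

```python
import re

from bs4 import BeautifulSoup

# Invented sample page; only the class name "externalLink" and the
# "foo.com" pattern come from the description above.
HTML = """
<table>
  <tr><th>Plain heading</th><th><b>Bold heading</b></th></tr>
</table>
<a href="/about">About</a>
<a class="externalLink" href="http://foo.com/page">Foo</a>
<a class="externalLink" href="http://example.org/">Example</a>
"""

soup = BeautifulSoup(HTML, "html.parser")

# "Find all the links"
all_links = [a["href"] for a in soup.find_all("a")]

# "Find all the links of class externalLink"
external = [a["href"] for a in soup.find_all("a", class_="externalLink")]

# "Find all the links whose URLs match 'foo.com'"
foo_links = [a["href"] for a in soup.find_all("a", href=re.compile(r"foo\.com"))]

# "Find the table heading that's got bold text, then give me that text"
bold_heading = next(
    th.b.get_text() for th in soup.find_all("th") if th.b is not None
)

print(all_links, external, foo_links, bold_heading)
```

Each query is a single call on the parsed tree, so no custom parser has to be written for the page.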
Valuable data that was once locked up in poorly-designed websites is now within your reach. Projects that would have taken hours take only minutes with Beautiful Soup.
Requirements:
Python
What's New in This Release:
Beautiful Soup can now convert invalid HTML or XML into something approaching XHTML or valid XML.
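As a rough illustration of that conversion, the sketch below feeds a fragment with unclosed tags to the parser and serializes the result. It assumes the later `bs4` package with Python's built-in `html.parser`, not the 3.0.3 release itself, and the exact output can differ between parser back ends.

```python
from bs4 import BeautifulSoup

# Invalid input: neither <p> nor <b> is ever closed.
broken = "<p>Hello <b>world"

soup = BeautifulSoup(broken, "html.parser")

# Serializing the tree supplies the closing tags that were missing.
repaired = str(soup)

print(repaired)
```

The repaired string is well-formed in the sense that every opened tag is closed. It is not guaranteed to be valid XHTML, which matches the "something approaching" wording above.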
Beautiful Soup 3.0.3 has an English-language interface and runs on Linux.
The download is 69K and takes only a few seconds on a fast ADSL connection.