Author: Ian H. Witten
ASIN: B00440E0PS
New from $25.00
Format: PDF
In this fully updated second edition of the highly acclaimed Managing Gigabytes, authors Witten, Moffat, and Bell continue to provide unparalleled coverage of state-of-the-art techniques for compressing and indexing data. Whatever your field, if you work with large quantities of information, this book is essential reading--an authoritative theoretical resource and a practical guide to meeting the toughest storage and access challenges. It covers the latest developments in compression and indexing and their application on the Web and in digital libraries. It also details dozens of powerful techniques supported by mg, the authors' own system for compressing, storing, and retrieving text, images, and textual images. mg's source code is freely available on the Web.
* Up-to-date coverage of new text compression algorithms such as block sorting, approximate arithmetic coding, and fast Huffman coding
* New sections on content-based index compression and distributed querying, with 2 new data structures for fast indexing
* New coverage of image coding, including descriptions of de facto standards in use on the Web (GIF and PNG), information on CALIC, the new proposed JPEG Lossless standard, and JBIG2
* New information on the Internet and WWW, digital libraries, web search engines, and agent-based retrieval
* Accompanied by a public-domain system called MG, which is a fully worked-out operational example of the advanced techniques developed and explained in the book
* New appendix on an existing digital library system that uses the MG software
Managing Gigabytes: Compressing and Indexing Documents and Images, Second Edition (The Morgan Kaufmann Series in Multimedia Information and Systems) [Kindle Edition]
- File Size: 11296 KB
- Print Length: 550 pages
- Publisher: Morgan Kaufmann; 1 edition (May 17, 1999)
- Sold by: Amazon Digital Services, Inc.
- Language: English
- ASIN: B00440E0PS
- Text-to-Speech: Enabled
- Lending: Not Enabled
- Amazon Best Sellers Rank: #769,262 Paid in Kindle Store
This is the only book there is that will actually teach you how to build an information retrieval system (aka search engine). It discusses all the algorithms and tradeoffs, and comes with free downloadable source code to experiment with. Some of the material is standard, but covered in more implementation detail here than anywhere else. Some of the material is novel: you won't find better coverage of compression unless you hand-assemble twenty research papers and reverse-engineer them to figure out how they're implemented. But with "Managing Gigabytes", it's all here. (Although, after a particularly invigorating discussion of how to string together a bunch of techniques to compress their corpus and save a couple of hundred MB, I did a check and found you could buy 512MB of RAM for less than the cost of the book. Knowledge is Power, but sometimes a little cash is more powerful.) The only negative is that this book is not called "Managing Terabytes", as the first edition promised/threatened it might be. RAM and disk are cheap, but not that cheap, and for now terabytes (and sometimes petabytes) are managed only by NASA, Google, and a few others. I can't wait to see the third edition!
By Peter Norvig
MG gave a good introduction to the components of practical Information Retrieval (IR). You can clearly see that the authors have a genuine interest in the field! But I would like some more theoretical analysis of the algorithms used (e.g., O-notation) and more focus on parallel implementations of IR systems. Another book in the same area worth mentioning is "Modern Information Retrieval".
By Amund Tveit