# Rabin fingerprint

The **Rabin fingerprinting scheme** is a method for implementing fingerprints using polynomials over a finite field. It was proposed by Michael O. Rabin.^{[1]}

## SchemeEdit

Given an *n*-bit message *m*_{0},...,*m*_{n-1}, we view it as a polynomial of degree *n*-1 over the finite field GF(2).

We then pick a random irreducible polynomial of degree *k* over GF(2), and we define the fingerprint of the message *m* to be the remainder after division of by over GF(2) which can be viewed as a polynomial of degree *k*-1 or as a *k*-bit number.

## ApplicationsEdit

Many implementations of the Rabin–Karp algorithm internally use Rabin fingerprints.

The *Low Bandwidth Network Filesystem* (LBFS) from MIT uses Rabin fingerprints to implement variable size shift-resistant blocks.^{[2]}
The basic idea is that the filesystem computes the cryptographic hash of each block in a file. To save on transfers between the client and server,
they compare their checksums and only transfer blocks whose checksums differ. But one problem with this scheme is that a single insertion at the beginning of the file will cause every checksum to change if fixed-sized (e.g. 4 KB) blocks are used. So the idea is to select blocks not based on a specific offset but rather by some property of the block contents. LBFS does this by sliding a 48 byte window over the file and computing the Rabin fingerprint of each window. When the low 13 bits of the fingerprint are zero LBFS calls those 48 bytes a breakpoint and ends the current block and begins a new one. Since the output of Rabin fingerprints are pseudo-random the probability of any given 48 bytes being a breakpoint is (1 in 8192). This has the effect of shift-resistant variable size blocks. *Any* hash function could be used to divide a long file into blocks (as long as a cryptographic hash function is then used to find the checksum of each block): but the Rabin fingerprint is an efficient rolling hash, since the computation of the Rabin fingerprint of region *B* can reuse some of the computation of the Rabin fingerprint of region *A* when regions *A* and *B* overlap.

Note that this is a problem similar to that faced by rsync.

## See alsoEdit

## ReferencesEdit

**^**Michael O. Rabin (1981). "Fingerprinting by Random Polynomials" (PDF). Center for Research in Computing Technology, Harvard University. Tech Report TR-CSE-03-01. Retrieved 2007-03-22. Cite journal requires`|journal=`

(help)**^**Athicha Muthitacharoen, Benjie Chen, and David Mazières "A Low-bandwidth Network File System"

## External linksEdit

- Andrei Z. Broder (1993). "Some applications of Rabin's fingerprinting method". Retrieved 2011-09-12. Cite journal requires
`|journal=`

(help) - David Andersen (2007). "Exploiting Similarity for Multi-Source Downloads using File Handprints". Retrieved 2007-04-12. Cite journal requires
`|journal=`

(help) - Ross N. Williams (1993). "A painless guide to CRC Error detection algorithms".

This cryptography-related article is a stub. You can help Wikipedia by expanding it. |