Computing two-way and multi-way set similarities is a fundamental problem. This study focuses on estimating 3-way resemblance (Jaccard similarity) using b-bit minwise hashing. While traditional minwise hashing methods store each hashed value using 64 bits, b-bit minwise hashing stores only the lowest b bits (where b >= 2 for the 3-way case). Extending the prior work on 2-way similarity to 3-way similarity is technically non-trivial. We develop a precise estimator that is accurate but very complicated, and we recommend a much simplified estimator suitable for sparse data. Our analysis shows that b-bit minwise hashing can normally achieve a 10- to 25-fold improvement in the storage space required for a given estimator accuracy of the 3-way resemblance.
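To make the storage idea concrete, here is a minimal Python sketch, not the estimator from the study. It assumes a salted BLAKE2b hash as a stand-in for random permutations, and the helper names (`hash64`, `minhash_signature`, `keep_lowest_bits`, `three_way_match_rate`) and the toy sets are hypothetical. With full 64-bit values, the fraction of hash functions on which all three minima agree estimates the 3-way resemblance. After keeping only the lowest b bits, the raw match rate can only go up, because truncated values also collide by accident; correcting for that is the job of the estimators described in the abstract, which are not reproduced here.

```python
import hashlib


def hash64(x, seed):
    """64-bit hash of integer x under a seed (assumed stand-in for a random permutation)."""
    digest = hashlib.blake2b(
        str(x).encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "big"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, k):
    """One minimum hash value per seed 0..k-1."""
    return [min(hash64(x, seed) for x in items) for seed in range(k)]


def keep_lowest_bits(signature, b):
    """Keep only the lowest b bits of every stored hash value."""
    mask = (1 << b) - 1
    return [h & mask for h in signature]


def three_way_match_rate(s1, s2, s3):
    """Fraction of positions where all three signatures agree."""
    hits = sum(1 for x, y, z in zip(s1, s2, s3) if x == y == z)
    return hits / len(s1)


# Toy sets (hypothetical data): 100 shared elements out of 500 in the union.
A = set(range(0, 300))
B = set(range(100, 400))
C = set(range(200, 500))
exact = len(A & B & C) / len(A | B | C)

k = 400  # number of hash functions
sig_a, sig_b, sig_c = (minhash_signature(s, k) for s in (A, B, C))
full_estimate = three_way_match_rate(sig_a, sig_b, sig_c)

b = 2  # smallest b the abstract allows for the 3-way case
low_a, low_b, low_c = (keep_lowest_bits(s, b) for s in (sig_a, sig_b, sig_c))
naive_b_bit_rate = three_way_match_rate(low_a, low_b, low_c)

print("exact 3-way resemblance:", exact)
print("64-bit minwise estimate:", full_estimate)
print("raw b-bit match rate (uncorrected, biased upward):", naive_b_bit_rate)
```

In this sketch each stored value shrinks from 64 bits to 2 bits, but the raw b-bit match rate is not an estimate of the resemblance on its own; it would need a correction for accidental collisions.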
|
Date Found: March 26, 2011
Date Produced: March 25, 2011
|
|
VideoLectures |
July 10, 2011
The explosion in growth of the Web of Linked Data has provided, for the first time, a plethora of information in disparate locations, yet bound together by machine-readable, semantically typed relations. Utilisation of the Web of Data has been, until now, restricted to members of the community, ...
|
VideoLectures |
July 10, 2011
‘Problems cannot be solved with the mentality that has caused them’. Hence, the 2008- crisis cannot be solved with the ethics of the one-sided, short-term mentality of industrial and neoliberal economics, which has caused the ‘Bubble Economy’ of several recent decades. Neither the market nor the ...
|
VideoLectures |
July 10, 2011
Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks ...