… Lubar

ABSTRACT
Identifying which parts of a Web-page contain target content (e.g., the portion of an online news page that contains the actual article) is a significant problem that must be …

3 Beautiful Soup, by Leonard Richardson and others: http://www.crummy.com/software/BeautifulSoup/

following: <a>, <ins>, <del>, <span>, <bdo>, <em>, <strong>, <dfn>, …