This document describes the extraction and transformation module of an FDA web content mining project. The module uses Selenium with Java to extract content from the FDA website, including news listings, their dates, and links to PDF and ZIP files. Where elements have no ID, XPath expressions locate them on the page. Execution starts a web driver session, navigates to the news page, and collects the listings and their nested links into a hash map; it then extracts the linked content and details and writes them to files and a CSV.
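
The collect-then-export part of that flow can be sketched without a browser. The example below is only an illustration under stated assumptions, not the project's code: it swaps Selenium for the JDK's built-in `javax.xml.xpath` so that it runs on its own. The same kind of XPath expression could be passed to Selenium's `By.xpath` against a live page. The class name `NewsListingSketch`, the sample markup, the XPath expressions, and the CSV column names are all hypothetical. A `LinkedHashMap` stands in for the hash map so that rows keep page order.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NewsListingSketch {

    // Hypothetical, well-formed stand-in for a news listing page.
    // This is NOT the real FDA markup; note that no element carries an id.
    static final String SAMPLE_PAGE =
        "<html><body><ul>"
        + "<li><span>01/15/2020</span><a href=\"/news/item-a\">Item A, recall notice</a></li>"
        + "<li><span>01/16/2020</span><a href=\"/files/report.pdf\">Report \"B\"</a></li>"
        + "</ul></body></html>";

    /** Maps each link target to {date, title}, preserving page order. */
    static Map<String, String[]> collectListings(String page) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Locate list items by structure, since there are no ids to select on.
        NodeList items = (NodeList) xpath.evaluate(
            "//ul/li[a]", new InputSource(new StringReader(page)), XPathConstants.NODESET);

        Map<String, String[]> listings = new LinkedHashMap<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i);
            // Relative expressions are evaluated against the current list item.
            String date = xpath.evaluate("span", item);
            String title = xpath.evaluate("a", item);
            String href = xpath.evaluate("a/@href", item);
            listings.put(href, new String[] {date, title});
        }
        return listings;
    }

    /** Quotes a field when it contains a comma, a double quote, or a line break. */
    static String csvField(String value) {
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    /** Renders the collected listings as CSV text with a header row. */
    static String toCsv(Map<String, String[]> listings) {
        StringBuilder csv = new StringBuilder("date,title,link\n");
        for (Map.Entry<String, String[]> entry : listings.entrySet()) {
            csv.append(csvField(entry.getValue()[0])).append(',')
               .append(csvField(entry.getValue()[1])).append(',')
               .append(csvField(entry.getKey())).append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toCsv(collectListings(SAMPLE_PAGE)));
    }
}
```

Keying the map by link target means a link that appears twice on the page is stored only once. Whether that suits the real module depends on how it treats repeated listings, which the description above does not say.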