This article presents an algorithm for filtering artificially duplicated reads from 454 pyrosequencing data. The algorithm operates directly on flowgrams, without converting them to base calls, and groups similar flowgrams by hierarchical clustering in "flow space". In benchmarks on a large 454 sequencing dataset, it removed duplicates effectively, and its clusters agreed closely with clusters derived from mapped reads, as measured by a high Jaccard index. Removing the duplicate reads also improved genome assembly metrics such as N50.
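
The two evaluation metrics named above can be made concrete with a short sketch. It is an illustration only and does not reproduce the article's benchmarking code: the function names, the read identifiers, and the contig lengths are hypothetical. It also assumes the Jaccard index is computed over pairs of reads placed in the same cluster, which is a common way to compare two clusterings but is not confirmed by the text. N50 follows its standard definition.

```python
from itertools import combinations


def co_clustered_pairs(clusters):
    """Return the set of unordered read pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def jaccard_index(clusters_a, clusters_b):
    """Pairwise Jaccard index between two clusterings of the same reads.

    Assumption: agreement is measured on co-clustered read pairs,
    |A & B| / |A | B|. The article may define it differently.
    """
    pairs_a = co_clustered_pairs(clusters_a)
    pairs_b = co_clustered_pairs(clusters_b)
    union = pairs_a | pairs_b
    if not union:
        return 1.0  # neither clustering groups any reads together
    return len(pairs_a & pairs_b) / len(union)


def n50(contig_lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downward, first reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must be non-empty")


# Hypothetical read IDs: one clustering from flow space, one from mapped reads.
flow_space_clusters = [{"r1", "r2", "r3"}, {"r4"}]
mapped_read_clusters = [{"r1", "r2"}, {"r3", "r4"}]
print(jaccard_index(flow_space_clusters, mapped_read_clusters))  # 0.25

# Hypothetical contig lengths in base pairs.
print(n50([100, 80, 50, 30, 20, 20]))  # 80
```

In the first example the two clusterings share one co-clustered pair out of four distinct pairs, giving 0.25. In the second, the two longest contigs together cover 180 of 300 bases, so the N50 is 80.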