This paper introduces a novel trellis min-max algorithm for decoding non-binary low-density parity-check (NB-LDPC) codes. The algorithm reduces the number of messages exchanged between the check node and variable node processors, which improves throughput while maintaining decoding performance. The proposed architecture requires fewer hardware resources than conventional designs, addressing the large area and low throughput of existing decoders. The design is implemented and evaluated using ModelSim and Xilinx ISE, and the results show improvements in both performance and efficiency.
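For readers unfamiliar with min-max decoding, the following minimal sketch illustrates the basic check node update rule on which trellis variants build. It is not the trellis architecture proposed here. It assumes a code over GF(2^p) with symbols represented as integers, so that field addition is bitwise XOR, and it assumes the parity-check coefficients have already been applied to the incoming messages. Each message is a list of q reliabilities in which a smaller value means a more likely symbol. The function names `minmax_combine` and `check_node_update` are hypothetical and chosen only for illustration.

```python
from functools import reduce


def minmax_combine(m1, m2):
    """Combine two messages under the min-max rule.

    out[a] is the minimum, over all symbol pairs (b, c) with b XOR c == a,
    of max(m1[b], m2[c]). Assumes GF(2^p) symbols stored as integers.
    """
    q = len(m1)
    out = [float("inf")] * q
    for b in range(q):
        for c in range(q):
            a = b ^ c  # addition in GF(2^p) is bitwise XOR
            out[a] = min(out[a], max(m1[b], m2[c]))
    return out


def check_node_update(msgs):
    """Compute one outgoing message per edge of a check node.

    The message sent on edge j combines the incoming messages of all
    other edges, excluding edge j itself (the extrinsic principle).
    Requires at least two incoming messages.
    """
    results = []
    for j in range(len(msgs)):
        others = msgs[:j] + msgs[j + 1:]
        results.append(reduce(minmax_combine, others))
    return results
```

This brute-force form costs on the order of q² operations per pairwise combination and recomputes overlapping products for every edge. That cost is the kind of overhead that trellis-based formulations aim to reduce.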