This document describes a student project that aims to develop an efficient deep-learning technique for firearms monitoring, intended to help build secure smart cities. The project uses Faster R-CNN and EfficientDet models to detect guns and human faces in images. It proposes an ensemble approach that combines the outputs of these models to improve detection performance over that of the individual models. The document outlines the proposed system, hardware requirements, algorithms, literature review, and references.
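
The summary does not say how the model outputs are combined, so the following is only a minimal sketch of one common way to ensemble object detectors: pool the boxes from every model and apply per-class non-maximum suppression, keeping the highest-scoring box among overlapping duplicates. The function names (`iou`, `ensemble_nms`), the detection tuple layout, the 0.5 overlap threshold, and the sample boxes are all assumptions made for illustration and are not taken from the project.

```python
from typing import List, Tuple

# Hypothetical detection layout: (x1, y1, x2, y2, score, label)
Detection = Tuple[float, float, float, float, float, str]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_nms(per_model: List[List[Detection]],
                 iou_threshold: float = 0.5) -> List[Detection]:
    """Pool detections from all models, then suppress same-class duplicates."""
    # Visit boxes from most to least confident, regardless of source model.
    pooled = sorted((d for dets in per_model for d in dets),
                    key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box unless it overlaps an already kept box of the same class.
        if all(det[5] != k[5] or iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Made-up outputs standing in for the two detectors on one image.
faster_rcnn_out = [(10, 10, 50, 50, 0.90, "gun"),
                   (100, 100, 160, 180, 0.60, "face")]
efficientdet_out = [(12, 11, 52, 49, 0.80, "gun"),
                    (200, 40, 240, 90, 0.70, "gun")]

fused = ensemble_nms([faster_rcnn_out, efficientdet_out])
for det in fused:
    print(det)
```

In this example the two overlapping "gun" boxes collapse into the higher-scoring one, while the gun found by only one model and the face found by the other are both retained. That union of findings is the kind of gain an ensemble is meant to provide. Other fusion schemes, such as score-weighted box averaging, would fit the same interface.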