Dyes, organic molecules that absorb and emit light, have potential applications in biomedical imaging, organic photovoltaics, non-linear optics, and quantum information systems. Performance in these applications depends on dye orientation and on properties such as extinction coefficient, transition dipole moment, and aggregation ability. Dye aggregate networks assembled by deoxyribonucleic acid (DNA) templating exhibit exciton delocalization, energy transport, and fluorescence emission, with DNA nanotechnology providing the scaffold to which dyes attach in an aqueous environment. To control this process and optimize dye properties, a combination of machine learning, density functional theory (DFT), and time-dependent DFT (TD-DFT) was used to screen more than 26,000 dyes, select ideal dye candidates, and determine dye structure-property relationships. The machine learning models achieved accuracies above 90%. The top 15 dyes were identified on the basis of properties comparable to those of a reference dye, the pentamethine indocyanine dye Cy5. Molecular dynamics (MD) simulations were then performed to reveal dye aggregate-DNA interactions and dye orientations. The simulation results agreed well with experimental observations. The resulting data-driven computational workflow effectively identifies dyes with large extinction coefficients, large transition dipole moments, and good aggregation ability, and it can be used as a tool to develop new dyes for excitonic applications.
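The selection step of such a workflow, keeping candidates whose predicted properties are comparable to those of a reference dye, can be sketched as follows. This is only an illustrative sketch, not the procedure used in this work: the `Dye` class, the `screen` function, the 0.9 tolerance, the ranking by transition dipole moment, and all numerical values are hypothetical placeholders rather than computed or measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dye:
    """Hypothetical record of predicted dye properties."""
    name: str
    extinction: float  # predicted extinction coefficient, M^-1 cm^-1
    tdm: float         # predicted transition dipole moment, Debye
    aggregates: bool   # predicted aggregation ability


def screen(candidates, reference, tolerance=0.9, top_n=15):
    """Keep aggregating dyes whose extinction coefficient and transition
    dipole moment are at least `tolerance` times the reference values,
    then return the `top_n` ranked by transition dipole moment."""
    kept = [
        d for d in candidates
        if d.aggregates
        and d.extinction >= tolerance * reference.extinction
        and d.tdm >= tolerance * reference.tdm
    ]
    return sorted(kept, key=lambda d: d.tdm, reverse=True)[:top_n]


# Placeholder numbers for illustration only (assumed, not from the study).
reference = Dye("reference", 250_000.0, 15.0, True)
candidates = [
    Dye("dye_a", 260_000.0, 16.0, True),
    Dye("dye_b", 120_000.0, 9.0, True),
    Dye("dye_c", 240_000.0, 14.5, False),
    Dye("dye_d", 230_000.0, 14.0, True),
]

selected = [d.name for d in screen(candidates, reference)]
print(selected)
```

In a screen of this kind, the predicted properties would come from the machine learning models and the DFT and TD-DFT calculations; the threshold and ranking criterion here are design choices that would need to be tuned to the application.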