The advent of super-resolution microscopy allows the microstructure of foods to be explored in new depth and, when coupled with quantitative image analysis, provides a powerful analytical tool. Herein, a methodology based on 2D spatial cross-correlation analysis is presented and applied to investigate the relative spatial arrangement of protein and fat in acid-induced whole milk gels in which the milk is either non-homogenised or has been homogenised at 10 or 25 MPa. Two-channel images were acquired using super-resolution Stimulated Emission Depletion (STED) microscopy and confocal microscopy. A term is derived to extract the typical distance from the fat droplet surface to the local maximum of the protein distribution. The fat droplet size is determined through 2D spatial autocorrelation analysis. The analysis is applied both to global images and to specific regions focussing on individual fat droplets. The cross-correlation analysis has been empirically validated using generated images with precisely defined spatial features that correspond, over appropriate length scales, to the features of interest in real microscopy images. The protein microstructure, the fat droplet size and the distances between the fat droplets and the protein network are characterised. The distances between the fat droplets and the protein network differ significantly between the homogenised samples and the non-homogenised sample. The extracted separation distances are below the diffraction limit of light, highlighting the utility of super-resolution imaging.
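As an illustration of the kind of computation involved, the following is a minimal sketch of a normalised 2D spatial cross-correlation between two image channels, together with its radial average. It is not the implementation used in this work: the function names `spatial_cross_correlation` and `radial_profile` are hypothetical, the correlation is computed with FFTs and therefore assumes periodic image boundaries, and the derived term for the distance from the fat droplet surface to the protein maximum is not reproduced here.

```python
import numpy as np


def spatial_cross_correlation(a, b):
    """Normalised 2D circular cross-correlation of two equal-sized images.

    Assumption: periodic boundaries (FFT-based). The zero-lag value is
    placed at index (ny // 2, nx // 2) of the returned array.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Work with fluctuations about the mean intensity of each channel.
    da = a - a.mean()
    db = b - b.mean()
    # Correlation theorem: corr[k] = sum_n da[n] * db[n + k].
    corr = np.fft.ifft2(np.conj(np.fft.fft2(da)) * np.fft.fft2(db)).real
    # Normalise so that the autocorrelation of an image is 1 at zero lag.
    corr /= da.std() * db.std() * a.size
    return np.fft.fftshift(corr)


def radial_profile(corr):
    """Average a centred correlation map over lags of equal integer radius."""
    ny, nx = corr.shape
    y, x = np.indices(corr.shape)
    r = np.hypot(y - ny // 2, x - nx // 2).astype(int)
    sums = np.bincount(r.ravel(), weights=corr.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)
```

Passing the same channel twice gives the spatial autocorrelation, whose zero-lag value is 1 under this normalisation. Passing two different channels (for example, hypothetical `fat` and `protein` arrays of equal shape) gives a cross-correlation map whose radial profile can then be inspected for maxima at characteristic separation distances.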