High Dynamic Range (HDR) content is becoming ubiquitous due to the rapid development of capture technologies. Nevertheless, the dynamic range of common display devices is still limited, so tone mapping (TM) remains a key challenge for image visualization. Recent work has demonstrated that neural networks can achieve remarkable performance in this task when compared to traditional methods; however, the quality of the results of these learning-based methods is limited by the training data. Most existing works use as a training set a curated selection of best-performing results from existing traditional tone mapping operators (often guided by a quality metric); therefore, the quality of newly generated results is fundamentally limited by the performance of such operators. This quality might be further limited by the pool of HDR content that is used for training. In this work, we propose a learning-based self-supervised tone mapping operator that is trained at test time specifically for each HDR image and does not need any data labeling. The key novelty of our approach is a carefully designed loss function built upon fundamental knowledge of contrast perception that allows for directly comparing the content in the HDR and tone mapped images. We achieve this goal by reformulating classic VGG feature maps into feature contrast maps that normalize local feature differences by their average magnitude in a local neighborhood, allowing our loss to account for contrast masking effects. We perform extensive ablation studies and exploration of parameters and demonstrate that our solution outperforms existing approaches with a single set of fixed parameters, as confirmed by both objective and subjective metrics.
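
To make the idea of a feature contrast map concrete, the following is a minimal sketch of one possible reading of the description above, not the formulation used in this work. It assumes a single feature channel stored as a 2D list, a box window as the local neighborhood, the local mean of absolute values as the "average magnitude", and an L1 distance between contrast maps. The names `local_mean`, `feature_contrast`, `contrast_loss`, `radius`, and `eps` are hypothetical and chosen only for illustration.

```python
def local_mean(channel, radius=1):
    """Mean over a (2*radius+1)^2 box window, truncated at the borders."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [channel[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def feature_contrast(channel, radius=1, eps=1e-6):
    """Local feature difference divided by local average magnitude.

    Assumption: the difference is taken against the local mean and the
    normalizer is the local mean of |feature|; eps avoids division by zero.
    """
    mean = local_mean(channel, radius)
    magnitude = local_mean([[abs(v) for v in row] for row in channel], radius)
    h, w = len(channel), len(channel[0])
    return [[(channel[y][x] - mean[y][x]) / (magnitude[y][x] + eps)
             for x in range(w)] for y in range(h)]


def contrast_loss(feat_hdr, feat_tm, radius=1):
    """Mean absolute difference between two feature contrast maps (assumed L1)."""
    c_hdr = feature_contrast(feat_hdr, radius)
    c_tm = feature_contrast(feat_tm, radius)
    diffs = [abs(a - b)
             for row_a, row_b in zip(c_hdr, c_tm)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


# Toy stand-ins for one feature channel of the HDR and tone mapped images.
feat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
scaled = [[100.0 * v for v in row] for row in feat]  # same structure, larger range
flat = [[5.0] * 3 for _ in range(3)]                 # no local structure at all
```

Under these assumptions, uniformly rescaling a feature map leaves its contrast map essentially unchanged, so `contrast_loss(feat, scaled)` is close to zero, while a map with no local variation such as `flat` has zero contrast everywhere and is penalized against `feat`. This is one way to see how a normalized quantity could be compared across the HDR and tone mapped images despite their different ranges.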