In this paper, we propose verifiable image transformation networks that transform face sketches to photos and vice versa. Face sketch-photo synthesis is a popular task in computer vision, with applications in areas such as law enforcement and digital entertainment. Several existing face sketch-photo synthesis methods use feed-forward convolutional neural networks; however, it is hard to tell whether their outputs are well mapped by relying on loss values or accuracy results alone. In our approach, we use two ResNet encoder-decoder networks as image transformation networks: one for sketch-photo and the other for photo-sketch transformation. The two networks verify each other's outputs during training. For example, to verify the photo produced by the sketch-photo network, we feed it into the photo-sketch network and compute the loss between the resulting reverse-transformed sketch and the ground-truth sketch. Likewise, the sketch produced by the photo-sketch network is verified in the reverse direction. Our networks are trained with four loss functions: a sketch-photo loss and a photo-sketch loss for the basic transformation stages, and a sketch-photo verification loss and a photo-sketch verification loss for the verification stages. Our experiments on the CUFS dataset achieve reasonable results compared with state-of-the-art approaches.
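
The four losses described above can be illustrated with a minimal sketch. This is not the authors' implementation. The names `l1_loss`, `verified_losses`, and `total_loss` are hypothetical, the mean absolute error is an assumed distance (the text does not name one), the unweighted sum is an assumed way of combining the terms, and plain Python functions on lists of floats stand in for the two ResNet encoder-decoder networks.

```python
from typing import Callable, Dict, List

Image = List[float]            # toy flattened image; real inputs would be tensors
Net = Callable[[Image], Image]  # stand-in for a ResNet encoder-decoder network


def l1_loss(pred: Image, target: Image) -> float:
    """Mean absolute error (an assumed choice of distance)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def verified_losses(sketch_to_photo: Net, photo_to_sketch: Net,
                    sketch: Image, photo: Image) -> Dict[str, float]:
    """Compute the two basic losses and the two verification losses."""
    # Basic transformation stage: each network maps to the other domain.
    fake_photo = sketch_to_photo(sketch)
    fake_sketch = photo_to_sketch(photo)

    # Verification stage: pass each output through the opposite network
    # and compare the reverse-transformed result with the original input.
    recovered_sketch = photo_to_sketch(fake_photo)
    recovered_photo = sketch_to_photo(fake_sketch)

    return {
        "sketch_photo": l1_loss(fake_photo, photo),
        "photo_sketch": l1_loss(fake_sketch, sketch),
        "sketch_photo_verification": l1_loss(recovered_sketch, sketch),
        "photo_sketch_verification": l1_loss(recovered_photo, photo),
    }


def total_loss(losses: Dict[str, float]) -> float:
    """Unweighted sum of all four terms (the weighting is an assumption)."""
    return sum(losses.values())


# Toy paired data and toy "networks" (hypothetical, for illustration only).
sketch = [0.25, 0.5]
photo = [0.5, 1.0]


def toy_photo_to_sketch(p: Image) -> Image:
    return [v / 2 for v in p]


def toy_sketch_to_photo(s: Image) -> Image:
    # Deliberately biased so the losses are non-zero.
    return [2 * v + 0.25 for v in s]


losses = verified_losses(toy_sketch_to_photo, toy_photo_to_sketch, sketch, photo)
```

In this toy setup the biased sketch-photo mapping is penalized twice: once directly against the ground-truth photo, and once more when its output is mapped back and compared with the ground-truth sketch. That second penalty is the role the verification stage plays during training.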