A cool old paper - FCN text detector
They were using multi-layer masks for better semantic segmentation supervision before it was mainstream.
Too bad such models are a commodity now, you can just use pre-trained)
Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even...