I was tasked with adding an input box to a form that accepts HTML snippets. The snippets can be of various types but, of course, must exclude script tags. I wrote a line of code that validates HTML, but it failed QA testing because plain text could still be submitted through the form.
This check_html method validates the HTML provided, but it still allowed plain text, which I had to prevent. I finally came across a hacky workaround: parse the content with an XML call. After 90 minutes of deliberation I came to this solution:
*Note: I was checking for image tags because the XML call specifically disregards "<img>" tags and requires "<image>" instead. Still, the approach worked in the sense that it rejected plain text input. A peer later pointed out that not all XML is valid HTML, so this would not stand the test of time. I did say it was a hacky solution.
Not proud of this, I submitted my code for peer review and was quickly informed that, while my attempt was noted, plain text is valid HTML. Having never taken a formal "HTML" course in college, I had not known this. All that work, only to find out that I was trying to invalidate something that was technically valid.
I was then provided this solution thanks to code review:
Nokogiri parses the HTML and breaks it into a tree of nodes, with each element's contents as its children. This code checks whether any of those children are plain-text nodes and, if so, invalidates the input. Like the intro to Star Trek: Enterprise, "It's been a long road".