Page MenuHomePhabricator

[lib] Set up RegEx to match text between two spoiler characters
ClosedPublic

Authored by rohan on Oct 10 2022, 12:38 PM.
Tags
None
Referenced Files
F3374794: D5335.id17522.diff
Tue, Nov 26, 5:07 PM
F3368559: D5335.id17462.diff
Mon, Nov 25, 8:14 PM
Unknown Object (File)
Sun, Nov 24, 10:56 PM
Unknown Object (File)
Sun, Nov 24, 12:35 PM
Unknown Object (File)
Sun, Nov 24, 12:35 PM
Unknown Object (File)
Thu, Nov 7, 8:54 PM
Unknown Object (File)
Thu, Nov 7, 7:43 PM
Unknown Object (File)
Thu, Nov 7, 7:25 PM
Subscribers

Details

Summary

Set up the RegEx to be able to match characters between spoiler characters (for now, the spoiler characters are > and <, so any text between >these two symbols< will be matched by the RegEx. This is for https://linear.app/comm/issue/ENG-1981/set-up-regex-to-match-text-between-two-spoiler-characters, and subsequent parts of the task will be stacked on top of this diff.

Test Plan

Tested the RegEx on a random sentence with the two characters.
The result:

[NODEM] [
[NODEM]   '>spoilers here<',
[NODEM]   'spoilers here',
[NODEM]   index: 31,
[NODEM]   input: 'This is my testing phrase with >spoilers here<',
[NODEM]   groups: undefined
[NODEM] ]

Also a link to an interactive playground with the regex demo: https://regex101.com/r/EC2mJt/1

Diff Detail

Repository
rCOMM Comm
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Nice, looks good. I'm assuming you already spoke to atul/ashoat about using >SPOILER< to mark spoilers. I know Discord uses ||SPOILER|| to mark spoilers and some other places use >! SPOILER to mark spoilers

Doesn't seem like spoiler is included in either the CommonMark or GitHub Flavored Markdown spec (which AFAIK are the two "big ones").

So I guess it's up to us to decide what we want to do... looks like reddit does >! SPOILER !< in addition to the ones @ginsu mentioned. Maybe we support all of them to reduce the learning curve for new users? Not sure

CC @ashoat to get his thoughts

Tested the RegEx on a random sentence with the two characters.

Thoughts on adding some unit tests for this?

In D5335#157437, @atul wrote:

Tested the RegEx on a random sentence with the two characters.

Thoughts on adding some unit tests for this?

Sure, happy to add unit tests, I can do that a bit after as a separate task once we decide on the format for spoilers - originally messaged on Comm, but this might be a better place so others can contribute on what they think.

Looks like reddit does >! SPOILER !< in addition to the ones @ginsu mentioned. Maybe we support all of them to reduce the learning curve for new users? Not sure

I think there are a few ways to do this, I think something that will be easy on mobile will make the most sense since a more complicated pattern could make it harder for mobile users. Both >! SPOILER !< and || SPOILER || seem fine to me based on Ginsu's suggestion - my opinion is the last one is a bit easier from a user perspective.

Temporarily setting the spoiler RegEx to match text between || and ||, since using > causes problems with other Markdown. Waiting on final confirmation still on how we want to detect spoiler text.

https://regex101.com/r/jGdcab/1

ashoat requested changes to this revision.Oct 10 2022, 3:47 PM

Sending back with a question. Agree with this decision on the format given overlap with quotation regex

Sure, happy to add unit tests, I can do that a bit after as a separate task once we decide on the format for spoilers

Can you create a Linear task for this and link it here?

lib/shared/markdown.js
253 ↗(On Diff #17462)

What's the question mark doing here?

This revision now requires changes to proceed.Oct 10 2022, 3:47 PM

Sure, here's the task: https://linear.app/comm/issue/ENG-2000/add-unit-tests-to-test-the-regex-for-spoiler-text
Will update it with some more details

lib/shared/markdown.js
253 ↗(On Diff #17462)

I wanted a lazy evaluation of the text in order to allow for multiple spoilers within a text. As opposed to a greedy evaluation where it will go to the last instance of || in the sentence.

So from my understanding (feel free to correct me here):

Lazy: This is some ||spoiler text|| and this is ||more text to hide||.
Match: spoiler text AND more text to hide

Greedy: This is some ||spoiler text|| and this is ||more text to hide||.
Match: spoiler text|| and this is ||more text to hide

Makes perfect sense, thank you!!

This revision now requires review to proceed.Oct 11 2022, 5:41 AM
This revision is now accepted and ready to land.Oct 11 2022, 5:44 AM

Add a caret to the spoiler regex to prevent simple-markdown errors and for
consistency with other regex expressions in the file