Add support for .nls extension in NetLogo
#7518
Please note that the `.nls` extension is unique: no other language in `languages.yml` uses it.
And that's going to cause a problem, as there appear to be more non-NetLogo files using this extension than NetLogo ones (add NOT before the keyword in your search), which means they will all be incorrectly classified.
You will need to identify the other main user of this extension, add support for it at the same time in this PR, and use a heuristic to differentiate the two.
Hi @lildude,
Thanks for pointing out the issue. I've added heuristics to distinguish between the file types. Let me know if any further changes are needed.
Alhadis left a comment
I question the need for heuristics in the first place, as our classifier should do a decent enough job disambiguating between the filetypes. If not, more (and better-quality) samples are needed.
Regardless, the feedback I've left relates specifically to the accuracy and formatting of the heuristics you've added.
lib/linguist/heuristics.yml
Outdated
    pattern:
    - '^\s*;'
    - '^\s*to\s+[\w-]+'
    - '^\s*to-report\s+[\w-]+'
    - '^\s*__includes\s+\['
    - '^\s*extensions\s+\['
    - '^\s*globals\s+\['
    - '^\s*breed\s+\['
    - '^\s*turtles-own\s+\['
    - '^\s*patches-own\s+\['
    - '^\s*links-own\s+\['
    - '^\s*undirected-link-breed\s+\['
    - '^\s*directed-link-breed\s+\['
    - '^\s*ask\s+[\w-]+\s+\['
- Many of these heuristics are extremely open-ended and could very easily match a valid line of a TeX or INI file. In particular, something like `^\s*;` is likely to match a comment line in an INI file (here are three such examples I plucked from a quick search).

  Even less obvious forms such as this might be matched by `^\s*ask\s+[\w-]+\s*\[` (remember, `\s` includes newlines as well as horizontal whitespace):

      notice = For help with regular expressions,
        ask Alhadis
      [section]
      Foo = Bar

- Secondly, these patterns (problematic as they are) can be written more efficiently as a single, combined expression in expanded `(?x)` mode.

  Suggested change: replace the `pattern:` list above with

      pattern: >-
        (?x) ^ \s*
        ( ;
        | to(-report)? \s+ [\w-]+
        | ask \s+ [\w-]+ \s+ \[
        | (extension|global|__include)s \s+ \[
        | (turtles|patches|links)-own \s+ \[
        | ((un)?directed-link-)?breed \s+ \[
        )
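The two review points above can be checked with a small Python sketch. This is not part of the PR: it assumes Python's `re` with `re.MULTILINE` as a stand-in for the Ruby regex engine used by Linguist (where `^` matches at every line start), and the sample strings and the helper names `any_listed` and `combined` are hypothetical. It shows the listed patterns firing on INI-style input and the combined `(?x)` expression agreeing with the list on these samples.

```python
import re

# The per-line patterns originally proposed in the PR.
LISTED = [
    r'^\s*;',
    r'^\s*to\s+[\w-]+',
    r'^\s*to-report\s+[\w-]+',
    r'^\s*__includes\s+\[',
    r'^\s*extensions\s+\[',
    r'^\s*globals\s+\[',
    r'^\s*breed\s+\[',
    r'^\s*turtles-own\s+\[',
    r'^\s*patches-own\s+\[',
    r'^\s*links-own\s+\[',
    r'^\s*undirected-link-breed\s+\[',
    r'^\s*directed-link-breed\s+\[',
    r'^\s*ask\s+[\w-]+\s+\[',
]

# The single combined expression suggested in the review (expanded mode).
COMBINED = r'''(?x) ^ \s*
( ;
| to(-report)? \s+ [\w-]+
| ask \s+ [\w-]+ \s+ \[
| (extension|global|__include)s \s+ \[
| (turtles|patches|links)-own \s+ \[
| ((un)?directed-link-)?breed \s+ \[
)'''

def any_listed(text):
    """True if at least one of the listed patterns matches somewhere in text."""
    return any(re.search(p, text, re.MULTILINE) for p in LISTED)

def combined(text):
    """True if the combined expression matches somewhere in text."""
    return re.search(COMBINED, text, re.MULTILINE) is not None

# Hypothetical sample inputs, written for this sketch only.
INI_COMMENT = "; audio settings\n[audio]\nvolume = 10\n"
INI_ASK = ("notice = For help with regular expressions,\n"
           "  ask Alhadis\n[section]\nFoo = Bar\n")
NETLOGO = "globals [ population ]\nto setup\n  clear-all\nend\n"
TEX = "\\begin{thenomenclature}\n\\nomgroup{A}\n\\item [{$x$}] value\n"

for name, text in [("INI comment", INI_COMMENT), ("INI ask", INI_ASK),
                   ("NetLogo", NETLOGO), ("TeX", TEX)]:
    print(f"{name}: listed={any_listed(text)} combined={combined(text)}")
```

Both INI samples are reported as NetLogo matches, which is the false-positive risk described in the review: `\s+` after `Alhadis` consumes the newline, so the `[` of `[section]` completes the `ask` pattern.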
Hi @Alhadis,
Thank you for your review.
Regarding your points:
- I have no problem removing the heuristics. I didn't include them initially but added them at @lildude's request.
- I removed the `^\s*;` pattern, which, as you pointed out, was too open-ended, and used the `(?x)` mode as suggested to combine all patterns into a single expression. I made a few modifications to the regex you proposed.
- I maintained `\s` after the keyword because NetLogo, like other formats, allows the opening bracket on a new line (example). I removed the `ask` keyword to avoid confusion. Since the first `\s` is preceded by `^`, I don't think it needs to be changed.
- I also added more examples, all under licenses that permit commercial use.
The PR description has been updated to reflect these changes.
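A minimal sketch of what these changes mean in practice, assuming the revised NetLogo expression is the one that appears (URL-encoded) in the search links of the PR description; whether that is exactly what was committed to `heuristics.yml` is an assumption. Python's `re` with `re.MULTILINE` stands in for Ruby's line-anchored `^`, and the sample strings and the `is_netlogo` helper are hypothetical.

```python
import re

# Revised expression as it appears (decoded) in the PR's search links:
# no ';' alternative, no 'ask' alternative, and '\s*' before '['.
NETLOGO = re.compile(r'''
    ^ \s*
    ( to(-report)? \s+ \w[\w-]*
    | (extension|global|__include)s \s* \[
    | (turtles|patches|links)-own \s* \[
    | ((un)?directed-link-)?breed \s* \[
    )
''', re.VERBOSE | re.MULTILINE)

def is_netlogo(text):
    """True if the revised expression matches anywhere in text."""
    return NETLOGO.search(text) is not None

# NetLogo allows the opening bracket on the line after the keyword;
# '\s' also matches newlines, so this is still recognized.
BRACKET_ON_NEXT_LINE = "globals\n[\n  population\n]\n"

# The INI-style inputs that tripped the earlier patterns.
INI_COMMENT = "; audio settings\n[audio]\nvolume = 10\n"
INI_ASK = ("notice = For help with regular expressions,\n"
           "  ask Alhadis\n[section]\nFoo = Bar\n")

for name, text in [("bracket on next line", BRACKET_ON_NEXT_LINE),
                   ("INI comment", INI_COMMENT),
                   ("INI ask", INI_ASK)]:
    print(f"{name}: {is_netlogo(text)}")
```

With the `;` and `ask` alternatives gone, neither INI sample matches, while the keyword-then-newline-then-bracket layout still does.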
Description
Hi there,
I’d like to propose adding the `.nls` extension to the NetLogo language in `linguist`. This is an official NetLogo file type, but it's currently missing from `languages.yml`.

I've added heuristics to differentiate `.nls` files as NetLogo, TeX, or INI. Each pattern was tested individually, and all of them return results associated with the extension.

Thanks for considering this!
Checklist:
Search results:
- `.nls` (All | ~5.5k files): https://github.com/search?q=NOT+is%3Afork+path%3A*.nls&type=code&utf8=%E2%9C%93
- `.nls` (NetLogo patterns | ~2k files): https://github.com/search?q=NOT+is%3Afork+path%3A*.nls+%2F%5E%5Cs*%28to%28-report%29%3F%5Cs%2B%5Cw%5B%5Cw-%5D*%7C%28extension%7Cglobal%7C__include%29s%5Cs*%5C%5B%7C%28turtles%7Cpatches%7Clinks%29-own%5Cs*%5C%5B%7C%28%28un%29%3Fdirected-link-%29%3Fbreed%5Cs*%5C%5B%29%2F&type=code&utf8=%E2%9C%93
- `.nls` (TeX patterns | 187 files): https://github.com/search?q=NOT+is%3Afork+path%3A*.nls+%2F%5E%5Cs*%28%5C%5Cbegin%5C%7B%7C%5C%5Cend%5C%7B%7C%5C%5Cnomgroup%5C%7B%7C%5C%5Citem%29%2F&type=code&utf8=%E2%9C%93
- `.nls` (INI patterns NOT TeX NOT NetLogo | ~2.7k files): https://github.com/search?q=NOT+is%3Afork+path%3A*.nls+%2F%5E%5Cs*%28%5B%5Cw%23-%5D%2B%5Cs*%3D%7CSTRINGTABLE%7CLOCALE_%7CLGRPID_%29%2F+NOT+%2F%5E%5Cs*%28%5C%5Cbegin%5C%7B%7C%5C%5Cend%5C%7B%7C%5C%5Cnomgroup%5C%7B%7C%5C%5Citem%29%2F+NOT+%2F%5E%5Cs*%28to%28-report%29%3F%5Cs%2B%5Cw%5B%5Cw-%5D*%7C%28extension%7Cglobal%7C__include%29s%5Cs*%5C%5B%7C%28turtles%7Cpatches%7Clinks%29-own%5Cs*%5C%5B%7C%28%28un%29%3Fdirected-link-%29%3Fbreed%5Cs*%5C%5B%29%2F&type=code&utf8=%E2%9C%93%3E
- `.nls` (NOT NetLogo | ~3.5K files): https://github.com/search?q=NOT+is%3Afork+path%3A*.nls+NOT+%2F%5E%5Cs*%28to%28-report%29%3F%5Cs%2B%5Cw%5B%5Cw-%5D*%7C%28extension%7Cglobal%7C__include%29s%5Cs*%5C%5B%7C%28turtles%7Cpatches%7Clinks%29-own%5Cs*%5C%5B%7C%28%28un%29%3Fdirected-link-%29%3Fbreed%5Cs*%5C%5B%29%2F&type=code&utf8=%E2%9C%93%3E

Sample sources:
- `0_init.nls` (NetLogo): https://github.com/WKSu/compound-events/blob/30a990574b2e2f0fe84b72492ed66ef40844a122/code/0_init.nls
- `config-reader.nls` (NetLogo): https://github.com/swarmfabsim/swarmfabsim.github.io/blob/418686c8eedc9d8d39ef4c6907f3c7690360477f/src/config-reader.nls
- `nodes.nls` (NetLogo): https://github.com/ric-colasanti/TRIFIC/blob/691a8f05e0facf21383df381681d040c9c2e9abd/nodes.nls
- `output.nls` (NetLogo): https://github.com/harrykipper/covid/blob/3593a2e73df8764dad1625a4e3f87fc3384b07dd/output.nls
- `parameters.nls` (NetLogo): https://github.com/mess-nlesc/model/blob/303d91fc0756c603e4c7396cd10bbc8fed3a3c43/model/parameters.nls
- `police.nls` (NetLogo): https://github.com/bkalthoff/DistIntSys20/blob/f5eaf221063548a55fee5088c4d41c9e00e7bc2a/police.nls
- `roads.nls` (NetLogo): https://github.com/ric-colasanti/TRIFIC/blob/691a8f05e0facf21383df381681d040c9c2e9abd/Dev/roads.nls
- `setup-map.nls` (NetLogo): https://github.com/sustentarea/logoclim/blob/adcd28753718c143540615cad46b9983524c0d74/nlogox/nls/setup-map.nls
- `setup-procedures.nls` (NetLogo): https://github.com/comses/megadapt/blob/6230bb4710ed4326e1298edf4788d8fb84bcbf20/src/netlogo/setup-procedures.nls#L81
- `show-values.nls` (NetLogo): https://github.com/danielvartan/nlogo-utils/blob/48a61c93da6d8469703b36540abe9b81c6b373b1/nlogo/show-values.nls
- `edengths.nls` (TeX): https://github.com/fneum/ev_chargingcoordination2017/blob/581cd3879af85d269d38f446556a9ea600e87457/docs/latex_dissertation/edengths.nls
- `Hauptdatei.nls` (TeX): https://github.com/PrinceSimple/Seminararbeit_Generative_Modelle/blob/f4cecddea929c4f3e362cc687ad968e58ed754cc/Hauptdatei.nls
- `main.nls` (TeX): https://github.com/ShevonKuan/Engineering-mechanics/blob/c80e4ee494bd9fed191ed590e07e7748d560d666/main.nls
- `myThesis.nls` (TeX): https://github.com/jiec827/njustThesis/blob/bc30a8096b649a79367aa2f0ba29534401463705/myThesis.nls
- `tcc-ifbaiano.nls` (TeX): https://github.com/SavioKSLopes/TCC/blob/ebce4fff70634c0d2503781a9885fbb1f00ccf81/tcc-ifbaiano.nls
- `ben.nls` (INI): https://github.com/reactos/wine/blob/2e8dfbb1ad71f24c41e8485a39df01bb9304127f/dlls/kernel32/nls/ben.nls
- `en-us.nls` (INI): https://github.com/WinXP655/SimpleEdit/blob/40848407151ee2852828f773c48fe5ef30940b62/locales/en-us.nls
- `ena.nls` (INI): https://github.com/kode54/wine/blob/503f5fa59579965eea984b09bb3f72ce03a2bf93/dlls/kernel32/nls/ena.nls
- `j9bcv.nls` (INI): https://github.com/govindsaju/openj9/blob/ed17fbe632f3dd14cb75607ba26bc7759bc6c959/runtime/nls/vrfy/j9bcv.nls
- `settings.nls` (INI): https://github.com/Briclyaz/NLSound_module_QCom/blob/11a0eefd0a3b275493080e29862f9d61ef31e12d/settings.nls

Sample licenses:
- `0_init.nls` (NetLogo): GNU General Public License v3.0 (https://github.com/WKSu/compound-events/blob/30a990574b2e2f0fe84b72492ed66ef40844a122/LICENSE)
- `config-reader.nls` (NetLogo): Creative Commons Attribution 4.0 International (https://github.com/swarmfabsim/swarmfabsim.github.io/blob/418686c8eedc9d8d39ef4c6907f3c7690360477f/license.txt)
- `nodes.nls` (NetLogo): Apache License 2.0 (https://github.com/ric-colasanti/TRIFIC/blob/691a8f05e0facf21383df381681d040c9c2e9abd/LICENSE)
- `output.nls` (NetLogo): GNU General Public License v3 (https://github.com/harrykipper/covid/blob/3593a2e73df8764dad1625a4e3f87fc3384b07dd/LICENSE)
- `parameters.nls` (NetLogo): Apache License 2.0 (https://github.com/mess-nlesc/model/blob/303d91fc0756c603e4c7396cd10bbc8fed3a3c43/LICENSE)
- `police.nls` (NetLogo): MIT License (https://github.com/bkalthoff/DistIntSys20/blob/f5eaf221063548a55fee5088c4d41c9e00e7bc2a/LICENSE)
- `roads.nls` (NetLogo): Apache License 2.0 (https://github.com/ric-colasanti/TRIFIC/blob/691a8f05e0facf21383df381681d040c9c2e9abd/LICENSE)
- `setup-map.nls` (NetLogo): GNU General Public License v3 (https://github.com/sustentarea/logoclim/blob/adcd28753718c143540615cad46b9983524c0d74/LICENSE.md)
- `setup-procedures.nls` (NetLogo): GNU General Public License v3 (https://github.com/comses/megadapt/blob/6230bb4710ed4326e1298edf4788d8fb84bcbf20/LICENSE)
- `show-values.nls` (NetLogo): Creative Commons CC0 1.0 Universal License (https://github.com/danielvartan/logoutils/blob/48a61c93da6d8469703b36540abe9b81c6b373b1/LICENSE.md)
- `edengths.nls` (TeX): GNU General Public License v3.0 (https://github.com/fneum/ev_chargingcoordination2017/blob/581cd3879af85d269d38f446556a9ea600e87457/LICENSE)
- `Hauptdatei.nls` (TeX): MIT License (https://github.com/PrinceSimple/Seminararbeit_Generative_Modelle/blob/f4cecddea929c4f3e362cc687ad968e58ed754cc/LICENSE)
- `main.nls` (TeX): GNU General Public License v3 (https://github.com/ShevonKuan/Engineering-mechanics/blob/c80e4ee494bd9fed191ed590e07e7748d560d666/LICENSE)
- `myThesis.nls` (TeX): GNU General Public License v2 (https://github.com/jiec827/njustThesis/blob/bc30a8096b649a79367aa2f0ba29534401463705/LICENSE)
- `tcc-ifbaiano.nls` (TeX): BSD 3-Clause "New" or "Revised" License (https://github.com/SavioKSLopes/TCC/blob/ebce4fff70634c0d2503781a9885fbb1f00ccf81/LICENSE)
- `ben.nls` (INI): GNU General Public License v2 (https://github.com/reactos/wine/blob/2e8dfbb1ad71f24c41e8485a39df01bb9304127f/dlls/kernel32/nls/ben.nls#L7)
- `en-us.nls` (INI): MIT License (https://github.com/WinXP655/SimpleEdit/blob/40848407151ee2852828f773c48fe5ef30940b62/LICENSE)
- `ena.nls` (INI): GNU Lesser General Public License v2.1 (https://github.com/kode54/wine/blob/503f5fa59579965eea984b09bb3f72ce03a2bf93/LICENSE)
- `j9bcv.nls` (INI): Apache License 2.0 (https://github.com/govindsaju/openj9/blob/ed17fbe632f3dd14cb75607ba26bc7759bc6c959/runtime/nls/vrfy/j9bcv.nls#L7)
- `settings.nls` (INI): GNU General Public License v2 (https://github.com/Briclyaz/NLSound_module_QCom/blob/11a0eefd0a3b275493080e29862f9d61ef31e12d/LICENSE)
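The search links in the checklist carry their regular expressions URL-encoded in the `q` parameter, which makes them hard to read. As a reading aid, here is a small Python sketch that decodes the TeX search link and tries the recovered expression on two lines. The helper name `regex_from_search_url` and the two sample lines are hypothetical, and the slicing assumes the query contains a single `/…/` expression, as the TeX link does.

```python
import re
from urllib.parse import urlsplit, parse_qs

# The "TeX patterns" search link from the checklist, split only for line length.
TEX_SEARCH = (
    "https://github.com/search?q=NOT+is%3Afork+path%3A*.nls+"
    "%2F%5E%5Cs*%28%5C%5Cbegin%5C%7B%7C%5C%5Cend%5C%7B%7C%5C%5Cnomgroup%5C%7B"
    "%7C%5C%5Citem%29%2F&type=code&utf8=%E2%9C%93"
)

def regex_from_search_url(url):
    """Return the text between the first and last '/' of the decoded q parameter."""
    query = parse_qs(urlsplit(url).query)["q"][0]
    return query[query.index("/") + 1 : query.rindex("/")]

tex_pattern = regex_from_search_url(TEX_SEARCH)
print(tex_pattern)  # ^\s*(\\begin\{|\\end\{|\\nomgroup\{|\\item)

# Hypothetical sample lines, one TeX-like and one NetLogo-like.
tex_line = "\\begin{thenomenclature}\n"
netlogo_line = "globals [ population ]\n"

print(bool(re.search(tex_pattern, tex_line, re.MULTILINE)))      # True
print(bool(re.search(tex_pattern, netlogo_line, re.MULTILINE)))  # False
```

The same helper does not apply directly to the INI link, whose query chains three `/…/` expressions with `NOT`; splitting those would need a slightly different extraction.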