close
Skip to content

Add jq language#5233

Merged
lildude merged 4 commits into
github-linguist:masterfrom
wader:jq
Mar 8, 2021
Merged

Add jq language#5233
lildude merged 4 commits into
github-linguist:masterfrom
wader:jq

Conversation

@wader
Copy link
Copy Markdown
Contributor

@wader wader commented Feb 23, 2021

https://stedolan.github.io/jq/

Description

Language color is based on code example color from jq documentation.

Checklist:

@wader wader requested a review from a team as a code owner February 23, 2021 21:12
@wader
Copy link
Copy Markdown
Contributor Author

wader commented Feb 23, 2021

Hi, I will probably need some help with heuristics to distinguish between jsoniq and jq, they both use .jq. Also i'm not sure I did the correct thing and added jq as source.jq.2 as source.jq is already used by jsoniq. I also did a fork of https://github.com/bencampion/language-jq to https://github.com/wader/language-jq to be able to get scope name source.jq.2.

@wader wader mentioned this pull request Feb 23, 2021
Comment thread lib/linguist/languages.yml Outdated
@wader wader force-pushed the jq branch 5 times, most recently from c693000 to 305ee86 Compare February 25, 2021 12:02
@wader
Copy link
Copy Markdown
Contributor Author

wader commented Feb 25, 2021

Nice work on tests for this repo 👍 catches a lot of things!

Comment thread lib/linguist/languages.yml Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be

Suggested change
Jq:
JQ:

instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do, what is the convention? i think jq commonly is all lowercase but it seems the yml file uses captial keys?

Copy link
Copy Markdown
Collaborator

@Alhadis Alhadis Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a language's name uses non-standard or unusual casing (as jq does), then we use that; otherwise, Label Case is used (unless it's an obvious acronym, like HTML).

Aside from renaming the YAML entry, the samples/Jq directory will need to be renamed to use the updated capitalisation. Now, committing this might be tricky if you're using an OS with a case-insensitive filesystem (e.g., macOS). Running the following commands from the root of your fork should do the trick:

git config --local --add core.ignorecase false
git mv samples/Jq samples/JQ

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bloody hell, I gave you the wrong capitalisation. 😰 That last line should have been

git mv samples/Jq samples/jq

So you'll now have to run

git mv samples/JQ samples/jq

Apologies for the screw-up. 😞

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, should be "jq" as yml key also?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed that last question.

Yes. Everywhere the language's name appears should match the exact characters used in languages.yml: Heuristics, tests, the name of the samples/ subdirectory, the grammar-index, etc.

The only exception to this is the F* language, which necessitated a new field that allowed us to use filename-safe characters for `./samples/Fstar/. However, this is an edge-case, and one you'll almost certainly never have to worry or care about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i assumed so, so already changed it

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 2, 2021

Just curious, i haven't specified a path to the syntax file in the submodule, will be derived automatically somehow?

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 2, 2021

Just curious, i haven't specified a path to the syntax file in the submodule, will be derived automatically somehow?

Aye. This should give you the gist of where Linguist looks for grammar files.

[…] as source.jq is already used by jsoniq

Are you sure that's for a different format also named JQ? If not, we can use the JSONIQ repository's existing source.jq grammar.

@wader wader force-pushed the jq branch 2 times, most recently from 3c36bd7 to 9503e1d Compare March 2, 2021 21:35
@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 2, 2021

[…] as source.jq is already used by jsoniq

Are you sure that's for a different format also named JQ? If not, we can use the JSONIQ repository's existing source.jq grammar.

They have similar syntax and similar uses so not surprised it's confusing :) It's probably the reason linguist current is mis-detect some jq syntax as JSONIQ.

Compare:
https://github.com/stedolan/jq/blob/master/src/builtin.jq
vs
https://github-lightshow.herokuapp.com/?utf8=%E2%9C%93&scope=from-url&grammar_format=auto&grammar_url=https%3A%2F%2Fraw.githubusercontent.com%2Fwader%2Flinguist-language-jq%2Fmaster%2Fgrammars%2Fjq.tmLanguage.json&grammar_text=&code_source=from-url&code_url=https%3A%2F%2Fraw.githubusercontent.com%2Fstedolan%2Fjq%2Fmaster%2Fsrc%2Fbuiltin.jq&code=

So not sure what to do about the scope name, not great. Would be good in the future to be able to point the vendor submodule to a syntax file used for vscode/atom etc but extensions for them seem to use source.jq for both jq and JSONIQ.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 2, 2021

Do we have to add some heuristics because they are so similar?

@wader wader force-pushed the jq branch 5 times, most recently from 78d995a to c3df97f Compare March 2, 2021 22:08
@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 2, 2021

So not sure what to do about the scope name, not great

I'd submit a PR to the JSONiq repository to change the scope-name to source.jsoniq. I don't understand why the author picked source.jq as a unique identifier for their JSONiq grammar…

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 2, 2021

Do we have to add some heuristics because they are so similar?

I think so. Try /^def[ \t]+\w+(?:\(.*?\))?:/ for jq.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 2, 2021

So not sure what to do about the scope name, not great

I'd submit a PR to the JSONiq repository to change the scope-name to source.jsoniq. I don't understand why the author picked source.jq as a unique identifier for their JSONiq grammar…

Mm 😞 i guess they picked it based on extension name, both use .jq 😄

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 2, 2021

Some unique things for JSONIQ, uses (: ... :) for comments, := assignment and {| ... |} for scopes

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

def can occur anywhere in a jq pipeline

That's fine. We don't need heuristics to be exhaustive and airtight, unless we're distinguishing between very similar languages (e.g., C / C++ / Objective-C / Objective C++). Otherwise, a heuristic need only match commonly-encountered constructs that cover the lion's share of real-world uses, and do so accurately.

Don't forget, there's still the classifier to fall back on when heuristics (and every other strategy) fails to match a language.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 3, 2021

@Alhadis let me know when i should update things. I guess we should merge a rename and update of JSONiq submodule first?

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

@wader See wcandillon/language-jsoniq#24. I couldn't resist fixing some glaring mistakes in the JSONiq grammar (though I left the XQuery grammar alone). We'll wait for it to get merged, then we can bump the grammar. 👍

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 3, 2021

@wader See wcandillon/language-jsoniq#24. I couldn't resist fixing some glaring mistakes in the JSONiq grammar (though I left the XQuery grammar alone). We'll wait for it to get merged, then we can bump the grammar. 👍

Ok 👍 does the jq grammar look ok? i've noticed that it do allow quite a lot of things that are not valid jq, eg 1 || 1 but maybe it's hard to describe syntax exactly.

I noticed a merge commit got pushed, should i stay away from rebasing and just add more commits now?

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

I noticed a merge commit got pushed

Aye, you'll want to update your branch locally by pulling from https://github.com/wader/linguist/tree/jq. 😉

i've noticed that it do allow quite a lot of things that are not valid jq, eg 1 || 1 but maybe it's hard to describe syntax exactly.

That's fine. 👍 Syntax highlighting isn't the place for validating semantics; only stuff that's clearly illegal syntax really deserves to be marked as an error.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 3, 2021

Created a PR with the fixes wader/language-jq#4

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 3, 2021

@AlaineN Is linguist used in highlight code in github flavoured markdown?

@AlaineN
Copy link
Copy Markdown

AlaineN commented Mar 3, 2021

@AlaineN Is linguist used in highlight code in github flavoured markdown?

I think you may be intending to tag someone else.

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

I think you may be intending to tag someone else.

I think I might be that "someone else". 😁 @Alhadis, @AlaineN… close enough for autocomplete to get it wrong. Right, @lildud?

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

@AlaineN @Alhadis Is linguist used in highlight code in github flavoured markdown?

Alright, two things:

  1. Linguist actually doesn't do the highlighting; it simply selects the grammar which GitHub's highlighting engine (something called "PrettyLights") will use to make code look spiffy on GitHub.com
  2. I assume that by "GitHub-flavoured markdown", you're referring to fenced code-blocks. In which case, yes, the same grammar is used to highlight embedded code samples. However, this isn't limited to Markdown: other formats supported on GitHub (like AsciiDoc) utilise a similar feature.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 3, 2021

@AlaineN Is linguist used in highlight code in github flavoured markdown?

I think you may be intending to tag someone else.

Oj sorry for that, not sure how your name got into the auto completer

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 3, 2021

not sure how your name got into the auto completer

She's a GitHub staff member, and Linguist is hosted on the @github org, so... I imagine that's all it takes to be listed as an autocomplete suggestion.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 5, 2021

Getting close to merge? what happens after merge, will take some time until used by official github and jq-files being detected as jq?

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 6, 2021

Getting close to merge? what happens after merge, will take some time until used by official github and jq-files being detected as jq?

Hey @wader, sorry for the slow response.

Yes, this LGTM. 👍 As for merging, changes made to Linguist only take effect on GitHub once there's been a new release of Linguist. Hence you won't see the improved highlighting on the site straight away.

I should also point out that approval from a GitHub staff member is required before changes can be made to a production dependency, so we need to wait for @lildude to give his 👍 of approval.

@wader
Copy link
Copy Markdown
Contributor Author

wader commented Mar 6, 2021

Ok! and as i understand submodules are updated regularly in linguist so no need to send PR if the syntax is updated?

And thanks again for all the help!

@Alhadis
Copy link
Copy Markdown
Collaborator

Alhadis commented Mar 6, 2021

Ok! and as i understand submodules are updated regularly in linguist so no need to send PR if the syntax is updated?

Grammars are all bumped to the latest versions right before a release is cut. Only in peculiar scenarios (such as this one) is it necessary to manually bump a grammar ourselves.

Copy link
Copy Markdown
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks 🙇

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants