Consider hiding sample files from language statistics #346

@rdipardo

GitHub currently identifies this repository's primary programming language as Lean, because the one and only tracked file in that language has a disproportionately large byte count:

userDefinedLanguages

These stats are generated by github-linguist and locally reproducible:

$ cd userDefinedLanguages
$ github-linguist 

35.94%  45960      Lean
34.38%  43967      Python
9.80%   12531      Perl
7.77%   9934       LSL
3.32%   4250       Vim Script
2.69%   3436       Bicep
2.08%   2662       RobotFramework
1.76%   2248       E
0.80%   1019       Assembly
0.47%   601        Cython
0.47%   598        PowerShell
0.38%   489        YARA
0.16%   202        Max

Files can be excluded from these stats by adding linguist override rules to a .gitattributes file, e.g.,

# .gitattributes
# TODO: add EOL controls

UDL-samples/**    -linguist-detectable
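
One way to confirm that the rule matches the intended paths before committing is `git check-attr`, which reports how git resolves an attribute for a given path. The sketch below is only an illustration: it works in a scratch repository, and the two file paths (`UDL-samples/example.lean`, `src/example.py`) are hypothetical stand-ins, not files known to exist in this repository.

```shell
# Scratch repository, so the check does not touch a real checkout.
demo=$(mktemp -d)
git init --quiet "$demo"

# The rule proposed in this issue.
printf 'UDL-samples/**    -linguist-detectable\n' > "$demo/.gitattributes"

# Hypothetical paths; check-attr does not require the files to exist.
result=$(git -C "$demo" check-attr linguist-detectable -- \
    UDL-samples/example.lean src/example.py)
printf '%s\n' "$result"
```

A path covered by the rule should be reported as `unset`, and any other path as `unspecified`. Running `github-linguist --breakdown` in the real checkout afterwards lists the files still counted for each language.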

After committing this change, the programming languages actually in use come into much clearer focus:

77.01%  43967      Python
21.95%  12531      Perl
1.05%   598        PowerShell
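
The new percentages are simply each remaining language's byte count divided by the new total. As a rough cross-check, assuming only the three byte counts listed above stay detectable, the shares can be recomputed with awk:

```shell
# Byte counts copied from the filtered output above.
shares=$(printf '%s\n' '43967 Python' '12531 Perl' '598 PowerShell' |
    awk '{ bytes[NR] = $1; lang[NR] = $2; total += $1 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%.2f%%  %-10d %s\n", 100 * bytes[i] / total, bytes[i], lang[i]
         }')
printf '%s\n' "$shares"
```

The total comes to 57096 bytes, which gives 77.01%, 21.95%, and 1.05%, matching the figures github-linguist reports.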
