Skip to content

Conversation

@jeffhostetler
Copy link

@jeffhostetler jeffhostetler commented Jan 15, 2019

V7 addresses:
[] squashes in the fixup for windows process ancestry calculation.
[] adds optional socket_type parameter to the unix domain socket syntax.
[] allow unix domain socket syntax to try both stream and dgram if socket_type not specified.
[] add Perl prereq checks for JSON:PP in trace2 tests.
[] minor clang-format cleanup.

Thanks
Jeff

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

@dscho
Copy link
Member

dscho commented Jan 16, 2019

For the record, t6042-merge-rename-corner-cases.sh seems to be flakey on Windows (and this is the reason why the "CI for GitGitGadget's Git fork" build failed).

@jeffhostetler
Copy link
Author

@dscho Thanks for the info. I'll give it another chance later after I do a little cleanup and re-push.

@dscho
Copy link
Member

dscho commented Jan 16, 2019

You're welcome. I submitted #109 to fix it.

@jeffhostetler jeffhostetler force-pushed the core-trace2-2019-v0 branch 3 times, most recently from f1855e5 to 0d53e0e Compare January 17, 2019 21:16
@dscho
Copy link
Member

dscho commented Jan 18, 2019

From the build log:

2019-01-18T01:25:12.3925336Z compat/win32/ancestry.c:3:10: fatal error: Psapi.h: No such file or directory
2019-01-18T01:25:12.3981024Z #include <Psapi.h>
2019-01-18T01:25:12.4019672Z ^~~~~~~~~

... whoops ;-)

This is a problem I introduced in my endeavor to minimize the subset of the Git for Windows SDK used for the continuous testing. I added that header to the sparse checkout file and am rebuilding the minimal SDK here: https://git-for-windows.visualstudio.com/git/_build/results?buildId=29013. Once that is done, I'll trigger another CI for GitGitGadget's Git fork run here.

@dscho
Copy link
Member

dscho commented Jan 18, 2019

@jeffhostetler I started a rebuild with the updated minimal SDK.

@dscho
Copy link
Member

dscho commented Jan 18, 2019

I started a rebuild

There you go. All green.

git-for-windows-ci pushed a commit to git-for-windows/git-sdk-64 that referenced this pull request Jan 19, 2019
This header is used in gitgitgadget/git#108.

Signed-off-by: Johannes Schindelin <[email protected]>
git-for-windows-ci pushed a commit to git-for-windows/git-sdk-64 that referenced this pull request Jan 19, 2019
git-for-windows-ci pushed a commit to git-for-windows/git-sdk-64 that referenced this pull request Jan 19, 2019
All of these are needed in gitgitgadget/git#108.

Signed-off-by: Johannes Schindelin <[email protected]>
@jeffhostetler jeffhostetler changed the title WIP Trace2 tracing facility (V2) Trace2 tracing facility Jan 22, 2019
@jeffhostetler
Copy link
Author

/queue

@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 22, 2019

Submitted as [email protected]

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 24, 2019

This branch is now known as jh/trace2.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 24, 2019

This patch series was integrated into pu via git@cb3b9e7.

@gitgitgadget gitgitgadget bot added the pu label Jan 24, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Jan 25, 2019

On the Git mailing list, Josh Steadmon wrote (reply to this):

On 2019.01.22 13:22, Jeff Hostetler via GitGitGadget wrote:
> This patch series contains a greatly refactored version of my original
> Trace2 series [1] from August 2018.
> 
> A new design doc in Documentation/technical/api-trace2.txt (in the first
> commit) explains the relationship of Trace2 to the current tracing facility.
> Calls to the current tracing facility have not been changed, rather new
> trace2 calls have been added so that both continue to work in parallel for
> the time being.
> 
> [1] https://public-inbox.org/git/[email protected]/
> 
> Cc: [email protected]: [email protected]: [email protected]
> 
> Derrick Stolee (1):
>   pack-objects: add trace2 regions
> 
> Jeff Hostetler (13):
>   trace2: Documentation/technical/api-trace2.txt
>   trace2: create new combined trace facility
>   trace2: collect platform-specific process information
>   trace2:data: add trace2 regions to wt-status
>   trace2:data: add editor/pager child classification
>   trace2:data: add trace2 sub-process classification
>   trace2:data: add trace2 transport child classification
>   trace2:data: add trace2 hook classification
>   trace2:data: add trace2 instrumentation to index read/write
>   trace2:data: add subverb to checkout command
>   trace2:data: add subverb to reset command
>   trace2:data: add subverb for rebase
>   trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh
> 
>  Documentation/technical/api-trace2.txt | 1158 ++++++++++++++++++++++++
>  Makefile                               |   11 +
>  builtin/am.c                           |    1 +
>  builtin/checkout.c                     |    7 +
>  builtin/pack-objects.c                 |   12 +-
>  builtin/rebase.c                       |   19 +
>  builtin/receive-pack.c                 |    4 +
>  builtin/reset.c                        |    6 +
>  builtin/submodule--helper.c            |   11 +-
>  builtin/worktree.c                     |    1 +
>  cache.h                                |    1 +
>  common-main.c                          |   13 +-
>  compat/mingw.c                         |   11 +-
>  compat/mingw.h                         |    3 +-
>  compat/win32/ancestry.c                |  102 +++
>  config.c                               |    2 +
>  config.mak.uname                       |    2 +
>  connect.c                              |    3 +
>  editor.c                               |    1 +
>  exec-cmd.c                             |    2 +
>  git-compat-util.h                      |    7 +
>  git.c                                  |   65 ++
>  pager.c                                |    1 +
>  read-cache.c                           |   47 +-
>  remote-curl.c                          |    7 +
>  repository.c                           |    2 +
>  repository.h                           |    3 +
>  run-command.c                          |   63 +-
>  run-command.h                          |   17 +-
>  sequencer.c                            |    2 +
>  sh-i18n--envsubst.c                    |    3 +
>  sub-process.c                          |    1 +
>  submodule.c                            |   11 +-
>  t/helper/test-parse-options.c          |    3 +
>  t/helper/test-tool.c                   |    4 +
>  t/helper/test-tool.h                   |    1 +
>  t/helper/test-trace2.c                 |  273 ++++++
>  t/t0001-init.sh                        |    1 +
>  t/t0210-trace2-normal.sh               |  135 +++
>  t/t0210/scrub_normal.perl              |   48 +
>  t/t0211-trace2-perf.sh                 |  153 ++++
>  t/t0211/scrub_perf.perl                |   76 ++
>  t/t0212-trace2-event.sh                |  237 +++++
>  t/t0212/parse_events.perl              |  251 +++++
>  trace2.c                               |  809 +++++++++++++++++
>  trace2.h                               |  403 +++++++++
>  trace2/tr2_cfg.c                       |   92 ++
>  trace2/tr2_cfg.h                       |   19 +
>  trace2/tr2_dst.c                       |   90 ++
>  trace2/tr2_dst.h                       |   34 +
>  trace2/tr2_sid.c                       |   67 ++
>  trace2/tr2_sid.h                       |   18 +
>  trace2/tr2_tbuf.c                      |   32 +
>  trace2/tr2_tbuf.h                      |   23 +
>  trace2/tr2_tgt.h                       |  126 +++
>  trace2/tr2_tgt_event.c                 |  606 +++++++++++++
>  trace2/tr2_tgt_normal.c                |  331 +++++++
>  trace2/tr2_tgt_perf.c                  |  573 ++++++++++++
>  trace2/tr2_tls.c                       |  164 ++++
>  trace2/tr2_tls.h                       |   95 ++
>  trace2/tr2_verb.c                      |   30 +
>  trace2/tr2_verb.h                      |   24 +
>  transport-helper.c                     |    2 +
>  transport.c                            |    1 +
>  usage.c                                |   31 +
>  wt-status.c                            |   23 +-
>  66 files changed, 6353 insertions(+), 21 deletions(-)
>  create mode 100644 Documentation/technical/api-trace2.txt
>  create mode 100644 compat/win32/ancestry.c
>  create mode 100644 t/helper/test-trace2.c
>  create mode 100755 t/t0210-trace2-normal.sh
>  create mode 100644 t/t0210/scrub_normal.perl
>  create mode 100755 t/t0211-trace2-perf.sh
>  create mode 100644 t/t0211/scrub_perf.perl
>  create mode 100755 t/t0212-trace2-event.sh
>  create mode 100644 t/t0212/parse_events.perl
>  create mode 100644 trace2.c
>  create mode 100644 trace2.h
>  create mode 100644 trace2/tr2_cfg.c
>  create mode 100644 trace2/tr2_cfg.h
>  create mode 100644 trace2/tr2_dst.c
>  create mode 100644 trace2/tr2_dst.h
>  create mode 100644 trace2/tr2_sid.c
>  create mode 100644 trace2/tr2_sid.h
>  create mode 100644 trace2/tr2_tbuf.c
>  create mode 100644 trace2/tr2_tbuf.h
>  create mode 100644 trace2/tr2_tgt.h
>  create mode 100644 trace2/tr2_tgt_event.c
>  create mode 100644 trace2/tr2_tgt_normal.c
>  create mode 100644 trace2/tr2_tgt_perf.c
>  create mode 100644 trace2/tr2_tls.c
>  create mode 100644 trace2/tr2_tls.h
>  create mode 100644 trace2/tr2_verb.c
>  create mode 100644 trace2/tr2_verb.h
> 
> 
> base-commit: 77556354bb7ac50450e3b28999e3576969869068
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-108%2Fjeffhostetler%2Fcore-trace2-2019-v0-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-108/jeffhostetler/core-trace2-2019-v0-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/108
> -- 
> gitgitgadget

Several patches in this series have many style diffs as reported by
clang-format. Not all the diffs actually improve readability, but many
do. If you have clang-format installed, you can run:

git clang-format --style file --diff --extensions c,h ${commit}^ ${commit}

for each commit in the series to see what it thinks needs to be changed.


Other than that, I don't have any comments apart from what the other
reviewers have already mentioned.

Thanks for the series!

@@ -0,0 +1,1158 @@
Trace2 API
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 1/23/2019 3:51 PM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <[email protected]> writes:
> 
>> +These high-level events are written to one or more Trace2 Targets
>> +in a target-specific format.  Each Trace2 Target defines a different
>> +purpose-specific view onto the event data stream.  In this mannor,
> 
> "In this manner"
> 
>> +a single set of Trace2 API event calls in the Git source can drive
>> +different types of analysis.
>> +
>> ...
>> +$ cat ~/log.perf
>> +12:28:42.620675 common-main.c:38                  | d0 | main                     | version      |     |           |           |            | 2.20.1.155.g426c96fcdb
>> +12:28:42.621001 common-main.c:39                  | d0 | main                     | start        |     |           |           |            | git version
>> +12:28:42.621111 git.c:432                         | d0 | main                     | cmd_verb     |     |           |           |            | version (version)
>> +12:28:42.621225 git.c:662                         | d0 | main                     | exit         |     |  0.001227 |           |            | code:0
>> +12:28:42.621259 trace2/tr2_tgt_perf.c:211         | d0 | main                     | atexit       |     |  0.001265 |           |            | code:0
>> +------------
>> ...
>> +
>> +trace2_def_param(...)
> 
> Not limited to this single one, but please either
> 
>   - omit "..." in parens, unless all of these functions take varargs
>     of unspecified type (which I do not think is the case), or
> 
>   - write a proper prototype for these functions, explain what the
>     function itself and what the parameters are for.
> 
> I'll complain about lack of info around here later X-<.

I documented the prototypes in trace2.h and was hoping to avoid
duplicating all that text here in this document.  The list of
functions here in this document was more of an overview of the
groups of concepts covered.  I'll revisit this.


> 
>> +trace2_def_repo(...)
> 
>> +----------------
>> +
>> +Git Child Process Events::
>> +
>> +	These are concerned with the various spawned child processes,
>> +	including sub-git processes and hooks,
>> ++
>> +----------------
>> +trace2_child_start(...)
>> +trace2_child_exit(...)
>> +
>> +trace2_exec(...)
>> +trace2_exec_result(...)
>> +----------------
>> +
>> +Git Thread Events::
>> +
>> +	These are concerned with Git thread usage.
>> ++
>> +----------------
>> +trace2_thread_start(...)
>> +trace2_thread_exit(...)
>> +----------------
> 
> Lack of _wait()/_join() feels a bit curious, but _exit() from what
> is being waited would suffice as a substitute.

The 2 trace2_thread_ functions have to be run by the thread-proc itself
rather than by the code calling the pthread functions.  So to me it made
more sense inside the thread-proc for them to be named _start and _exit.
This gives us an event with the elapsed time of the thread in isolation
using TLS data within each thread.

If I understand your question about _wait/_join, adding trace2 calls
near the pthread_create and pthread_join, would be a bit of a mess
because of the usual create loop and then later the join loop.  And
the exit times would be less accurate, since we typically _join them
in array order rather in the order they finish.

And having the trace2_thread_ calls in the pthread caller doesn't let
me access the thread's private TLS data.

Wrapping a region (like I show later in the preload_index() example)
around both of the pthread loops gives us the overall time for the
threading and insight into the thread overhead.  (One could even
have a region around the pthread_create loop and another around the
pthread_join loop, but that might not be worth the trouble.

> 
>> +Initialization::
>> +
>> +	Initialization happens in `main()`.  The initialization code
>> +	determines which, if any, Trace2 Targets should be enabled and
>> +	emits the `version`, `start`, and `exit` events.  It causes an
>> +	`atexit` function and `signal` handler to be registered that
>> +	will emit `atexit` and `signal` events.
>> ++
>> +----------------
>> +int main(int argc, const char **argv)
>> +{
>> +	int exit_code;
>> +
>> +	trace2_initialize();
>> +	trace2_cmd_start(argv);
> 
> Nobody other than trace2 integration would make a call to this
> helper, so it may not matter, but sending av alone without ac, even
> though ac is almost always redundant, feels somewhat unexpected.

Agreed.  There were other places that took an argv that didn't have
an argc on hand in the calling code.  So rather than fake up one for
them, I just omitted it from all the calls in my API.


> 
>> +Command Details::
>> +
>> +	After the basics are established, additional process
>> +	information can be sent to Trace2 as it is discovered, such as
>> +	the command verb, alias expansion, interesting config
>> +	settings, the repository worktree, error messages, and etc.
>> ++
>> +----------------
>> +int cmd_checkout(int argc, const char **argv)
>> +{
>> +	// emit a single "def_param" event
>> +	trace2_def_param("core.editor", "emacs");
> 
> Without knowing what "def_param event" is, this example is hard to
> fathom.  At this point in the doc, the reader does not even know
> what "def" stands for.  Is this call to define a param called
> core.editor?  Is it reporting that the default value for core.editor
> is emacs?
> 
>> +	// emit def_params for any config setting matching a pattern
>> +	// in GIT_TR2_CONFIG_PARAMS.
>> +	trace2_cmd_list_config();
> 
> As the reader does not know what def_param is, this is also hard to
> follow.

I'll rewrite that section.  The idea is to define a parameter that
a perf-tester might consider important.  For example, "core.abbrev"
used to cause massive perf problems on our mega repo where it might
take minutes to compute 7 or 8 digit abbreviations for a log or
branch command (completely dominating the time to actually compute
the log or branch list).


> 
>> +	trace2_cmd_verb("checkout");
>> +	trace2_cmd_subverb("branch");
> 
> These are guessable.  It probably reports what the codepath is
> doing.

These are to help post-processing.  The git command line is parsed
in 2 or 3 different phases and it takes quite a bit of work to get
to cmd_*() function (such as skipping over the -c, -C, --exec-path,
and other such args, and handling the commands where there isn't
a cmd_*() function (such as "git --exec-path")).  Then within the
cmd_*() function, a lot of work to see which variant of the command
is (such as is this checkout checking out a branch or a single
file).  And there are short and long forms of many arguments. And then 
there is the alias expansion which wants to rewrite the command line
(possibly recursively).  And then there is the code to try to run the
non-builtin dashed form.

For post-processing, you want to be able to do something like this:

	select elapsed_time where verb='checkout' and subverb='branch'
	select elapsed_time where verb='checkout' and subverb='path'

and report averages over those 2 sets independently without having
to recreate all that argv parsing inside the database query (or perl
script on the trace log).

> 
>> +	trace2_def_repo(the_repository);
> 
> Again, this is not easy to guess.  Is it reporting what the default
> repository is?

I'll reword this.


> 
>> +	if (do_something(...))
>> +	    trace2_cmd_error("Path '%s': cannot do something", path);
> 
> This is guessable, which is good.
> 
>> +void run_child(...)
>> +{
>> +	int child_exit_code;
>> +	struct child_process cmd = CHILD_PROCESS_INIT;
>> +	...
>> +
>> +	trace2_child_start(&cmd);
>> +	child_exit_code = ...();
>> +	trace2_child_exit(&cmd, child_exit_code);
> 
> Even though child_exit() has not been explained just like
> def_param(), this is far more guessable.  I wonder why it is.  The
> name of the variable passed to it in the example certainly helps, as
> it is totally unclear where the string constant "emacs" in the
> earlier example came from (e.g. is it hardcoded in the program?  is
> it merely illustrating "here is how you report the value you have
> decided to use for 'core.editor' variable"?).

yes, some of these are synthetic uses of the API to illustrate
the trace2 concepts.  I just picked 'core.editor' at random.  I'll
revisit the examples.


> 
>> ...
>> +$ cat ~/log.perf
>> +d0 | main                     | version      |     |           |           |            | 2.20.1.160.g5676107ecd.dirty
>> +d0 | main                     | start        |     |           |           |            | git status
>> +d0 | main                     | def_repo     | r1  |           |           |            | worktree:/Users/jeffhost/work/gfw
>> +d0 | main                     | cmd_verb     |     |           |           |            | status (status)
>> +...
>> +d0 | main                     | region_enter | r1  |  0.010988 |           | status     | label:worktrees
> 
> It is hard to guess what "d0" and "r1" stand for here.
> 
> In an earlier example we also saw an unexplained "d0" there, which I
> think was OK because the illustration was merely to give the "feel"
> of the format up there.  But now we are giving a bit more detailed
> explanation, we probably would want to at least briefly mention what
> each column means.

yes, the example at the top was just to preview the formats.
I'll add more info here.

> I'd stop here for now, as I am more interested in reading the code
> ;-)
> 

Thanks
Jeff

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 28, 2019

This patch series was integrated into pu via git@d28b107.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 28, 2019

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 1/22/2019 4:22 PM, Jeff Hostetler via GitGitGadget wrote:
> This patch series contains a greatly refactored version of my original
> Trace2 series [1] from August 2018.


My Trace2 series "jh/trace2" has a bad interaction with "js/vsts-ci"
causing some unit tests to fail in "pu".  I'll post a new version
shortly.

Jeff

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 28, 2019

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff Hostetler <[email protected]> writes:

> On 1/22/2019 4:22 PM, Jeff Hostetler via GitGitGadget wrote:
>> This patch series contains a greatly refactored version of my original
>> Trace2 series [1] from August 2018.
>
>
> My Trace2 series "jh/trace2" has a bad interaction with "js/vsts-ci"
> causing some unit tests to fail in "pu".  I'll post a new version
> shortly.

Thanks for a heads-up.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 28, 2019

This patch series was integrated into pu via git@6240162.

@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 28, 2019

Submitted as [email protected]

@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 22, 2019

Submitted as [email protected]

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 23, 2019

This patch series was integrated into pu via git@d524676.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 23, 2019

This patch series was integrated into next via git@4c84c62.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2019

This patch series was integrated into pu via git@27373ae.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2019

This patch series was integrated into pu via git@b00e53d.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 4, 2019

This patch series was integrated into pu via git@faf76e1.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 5, 2019

This patch series was integrated into pu via git@b0949e5.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 6, 2019

This patch series was integrated into pu via git@cece601.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2019

This patch series was integrated into pu via git@32038fe.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2019

This patch series was integrated into next via git@32038fe.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2019

This patch series was integrated into master via git@32038fe.

@gitgitgadget gitgitgadget bot added the master label Mar 7, 2019
@gitgitgadget gitgitgadget bot closed this Mar 7, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2019

Closed via 32038fe.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 17, 2019

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):


On Fri, Feb 15 2019, Jeff Hostetler wrote:

> On 2/14/2019 7:33 AM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
>>
>> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
>>
>>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>>>
>>> There are no other outstanding comments that I'm aware of.
>>
>> Not a comment on this, just a follow-up question. I started looking into
>> whether this could be driven by config instead of getenv(). A lot easier
>> to set up in some cases than injecting env variables, especialy if the
>> log target supported a strftime() string, is any of that something
>> you've looked into already (so I don't do dupe work...).
>>
>> There's the chicken & egg problem with wanting to do traces way before
>> we get to reading config, so I expect that such a facility would need to
>> work by always trace record at the beginning until we get far enough to
>> write the config, and then either stop and throw away the buffer, or
>> write out the existing trace to the configured target, and continue.
>>
>
> Yes, I beat my head against the config settings for quite a while
> before settling on using an env var.
>
> I wanted to get the:
> () full process elapsed time,
> () the full original argv,
> () all of the command dispatch, run-dashed, and alias expansion,
> () and for my atexit() to run last.
> So I hooked into main() before the config is loaded.
>
> In most commands, the config is processed about the same time as
> parse_options() is called.  And we have to insert code in
> git_default_config() to load my settings.  This happens after all
> of the .git dir discovery, "-c" and "-C" processing, alias expansion,
> command dispatch and etc.  And the argv received in the cmd_*()
> function has been modified.  So we lose some information.  (I tried
> this for a while and didn't like the results.)
>
> I also tried using read_early_config() various places, but there
> were problems here too.  Too early and the "-c" settings weren't
> parsed yet.  And there was an issue about when .git dir was discovered,
> so local config settings weren't ready yet.
>
> I also recall having a problem when doing an early iteration with
> side effects (or rather the early iteration caused something to be
> set that caused the real iteration (in cmd_*()) to short-cut), but
> I don't remember the details.
>
> So we have a custom installer that also sets the environment variable
> after git is installed and haven't had any problems.
>
>
> I hesitate to say we should always start allocating a bunch of data
> and spinning up the TLS data and etc. before we know if tracing is
> wanted.  Just seems expensive for most users.
>
>
> I could see having a "~/.git_tr2_config" or something similar in
> some place like "/etc" that only contained the Trace2 settings.
> It would be safe to read very early inside main() and we would not
> consider it to be part of the real config.  For example, "git config"
> would not know about it.  Then you could enforce a system-wide
> setting without any of the env var issues.

I haven't written a patch for this, but it seems to me that we could
just start reading /etc/gitconfig via some custom config callback code
early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
2018-10-21) does for the worktree config.

It would ignore everything except trace.* or wherever namespace we'll
put this in, and "-c" etc. wouldn't work. We could just document that as
a limitation for now which could be fixed later.

It wouldn't make things worse, and would mean you could easily set
logging system-wide without needing to inject environment variables in
lots of custom (and sometimes hard-to-do) places, which I expect is the
most common use-case, and is the one I have.

> WRT the strftime() question, we could either add syntax to the
> env var value (or the tr2 config setting) to have some tokens
> for that.  I just stuck with absolute pathnames since I started
> by copying what was done for GIT_TRACE.
>
> Jeff

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 21, 2019

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-902831370-1553178617=:41
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

On Sun, 17 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

>
> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>
> > On 2/14/2019 7:33 AM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
> >>
> >> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
> >>
> >>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
> >>>
> >>> There are no other outstanding comments that I'm aware of.
> >>
> >> Not a comment on this, just a follow-up question. I started looking i=
nto
> >> whether this could be driven by config instead of getenv(). A lot eas=
ier
> >> to set up in some cases than injecting env variables, especialy if th=
e
> >> log target supported a strftime() string, is any of that something
> >> you've looked into already (so I don't do dupe work...).
> >>
> >> There's the chicken & egg problem with wanting to do traces way befor=
e
> >> we get to reading config, so I expect that such a facility would need=
 to
> >> work by always trace record at the beginning until we get far enough =
to
> >> write the config, and then either stop and throw away the buffer, or
> >> write out the existing trace to the configured target, and continue.
> >>
> >
> > Yes, I beat my head against the config settings for quite a while
> > before settling on using an env var.
> >
> > I wanted to get the:
> > () full process elapsed time,
> > () the full original argv,
> > () all of the command dispatch, run-dashed, and alias expansion,
> > () and for my atexit() to run last.
> > So I hooked into main() before the config is loaded.
> >
> > In most commands, the config is processed about the same time as
> > parse_options() is called.  And we have to insert code in
> > git_default_config() to load my settings.  This happens after all
> > of the .git dir discovery, "-c" and "-C" processing, alias expansion,
> > command dispatch and etc.  And the argv received in the cmd_*()
> > function has been modified.  So we lose some information.  (I tried
> > this for a while and didn't like the results.)
> >
> > I also tried using read_early_config() various places, but there
> > were problems here too.  Too early and the "-c" settings weren't
> > parsed yet.  And there was an issue about when .git dir was discovered=
,
> > so local config settings weren't ready yet.
> >
> > I also recall having a problem when doing an early iteration with
> > side effects (or rather the early iteration caused something to be
> > set that caused the real iteration (in cmd_*()) to short-cut), but
> > I don't remember the details.
> >
> > So we have a custom installer that also sets the environment variable
> > after git is installed and haven't had any problems.
> >
> >
> > I hesitate to say we should always start allocating a bunch of data
> > and spinning up the TLS data and etc. before we know if tracing is
> > wanted.  Just seems expensive for most users.
> >
> >
> > I could see having a "~/.git_tr2_config" or something similar in
> > some place like "/etc" that only contained the Trace2 settings.
> > It would be safe to read very early inside main() and we would not
> > consider it to be part of the real config.  For example, "git config"
> > would not know about it.  Then you could enforce a system-wide
> > setting without any of the env var issues.
>
> I haven't written a patch for this, but it seems to me that we could
> just start reading /etc/gitconfig via some custom config callback code
> early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
> 2018-10-21) does for the worktree config.

Oy. Oy, oy, oy.

Maybe use `read_early_config()` instead? That would be *a lot* cleaner.
You could maybe use a9bcf6586d1a (alias: use the early config machinery to
expand aliases, 2017-06-14) as an inspiration.

> It would ignore everything except trace.* or wherever namespace we'll
> put this in, and "-c" etc. wouldn't work. We could just document that as
> a limitation for now which could be fixed later.
>
> It wouldn't make things worse, and would mean you could easily set
> logging system-wide without needing to inject environment variables in
> lots of custom (and sometimes hard-to-do) places, which I expect is the
> most common use-case, and is the one I have.

Yes, I agree, those are good goals to address.

Ciao,
Dscho

> > WRT the strftime() question, we could either add syntax to the
> > env var value (or the tr2 config setting) to have some tokens
> > for that.  I just stuck with absolute pathnames since I started
> > by copying what was done for GIT_TRACE.
> >
> > Jeff
>

--8323328-902831370-1553178617=:41--

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 21, 2019

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):


On Thu, Mar 21 2019, Johannes Schindelin wrote:

> Hi =C3=86var,
>
> On Sun, 17 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
>
>>
>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>
>> > On 2/14/2019 7:33 AM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
>> >>
>> >> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
>> >>
>> >>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>> >>>
>> >>> There are no other outstanding comments that I'm aware of.
>> >>
>> >> Not a comment on this, just a follow-up question. I started looking i=
nto
>> >> whether this could be driven by config instead of getenv(). A lot eas=
ier
>> >> to set up in some cases than injecting env variables, especialy if the
>> >> log target supported a strftime() string, is any of that something
>> >> you've looked into already (so I don't do dupe work...).
>> >>
>> >> There's the chicken & egg problem with wanting to do traces way before
>> >> we get to reading config, so I expect that such a facility would need=
 to
>> >> work by always trace record at the beginning until we get far enough =
to
>> >> write the config, and then either stop and throw away the buffer, or
>> >> write out the existing trace to the configured target, and continue.
>> >>
>> >
>> > Yes, I beat my head against the config settings for quite a while
>> > before settling on using an env var.
>> >
>> > I wanted to get the:
>> > () full process elapsed time,
>> > () the full original argv,
>> > () all of the command dispatch, run-dashed, and alias expansion,
>> > () and for my atexit() to run last.
>> > So I hooked into main() before the config is loaded.
>> >
>> > In most commands, the config is processed about the same time as
>> > parse_options() is called.  And we have to insert code in
>> > git_default_config() to load my settings.  This happens after all
>> > of the .git dir discovery, "-c" and "-C" processing, alias expansion,
>> > command dispatch and etc.  And the argv received in the cmd_*()
>> > function has been modified.  So we lose some information.  (I tried
>> > this for a while and didn't like the results.)
>> >
>> > I also tried using read_early_config() various places, but there
>> > were problems here too.  Too early and the "-c" settings weren't
>> > parsed yet.  And there was an issue about when .git dir was discovered,
>> > so local config settings weren't ready yet.
>> >
>> > I also recall having a problem when doing an early iteration with
>> > side effects (or rather the early iteration caused something to be
>> > set that caused the real iteration (in cmd_*()) to short-cut), but
>> > I don't remember the details.
>> >
>> > So we have a custom installer that also sets the environment variable
>> > after git is installed and haven't had any problems.
>> >
>> >
>> > I hesitate to say we should always start allocating a bunch of data
>> > and spinning up the TLS data and etc. before we know if tracing is
>> > wanted.  Just seems expensive for most users.
>> >
>> >
>> > I could see having a "~/.git_tr2_config" or something similar in
>> > some place like "/etc" that only contained the Trace2 settings.
>> > It would be safe to read very early inside main() and we would not
>> > consider it to be part of the real config.  For example, "git config"
>> > would not know about it.  Then you could enforce a system-wide
>> > setting without any of the env var issues.
>>
>> I haven't written a patch for this, but it seems to me that we could
>> just start reading /etc/gitconfig via some custom config callback code
>> early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
>> 2018-10-21) does for the worktree config.
>
> Oy. Oy, oy, oy.
>
> Maybe use `read_early_config()` instead? That would be *a lot* cleaner.
> You could maybe use a9bcf6586d1a (alias: use the early config machinery to
> expand aliases, 2017-06-14) as an inspiration.

Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
.git/config -> ~/.gitconfig etc. sequence just in terms of saving
critical time (this is the performance trace path, after all...).

But on a second reading I see that read_early_config() can do that if
you set config_source->file, opts->respect_includes etc. I.e. it just
(depending on options) resolves to git_config_from_file() which
58b284a2e91 used directly.

>> It would ignore everything except trace.* or wherever namespace we'll
>> put this in, and "-c" etc. wouldn't work. We could just document that as
>> a limitation for now which could be fixed later.
>>
>> It wouldn't make things worse, and would mean you could easily set
>> logging system-wide without needing to inject environment variables in
>> lots of custom (and sometimes hard-to-do) places, which I expect is the
>> most common use-case, and is the one I have.
>
> Yes, I agree, those are good goals to address.
>
> Ciao,
> Dscho
>
>> > WRT the strftime() question, we could either add syntax to the
>> > env var value (or the tr2 config setting) to have some tokens
>> > for that.  I just stuck with absolute pathnames since I started
>> > by copying what was done for GIT_TRACE.
>> >
>> > Jeff
>>

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2019

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-962455917-1553260658=:41
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

On Thu, 21 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>
> > On Sun, 17 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
> >
> >>
> >> On Fri, Feb 15 2019, Jeff Hostetler wrote:
> >>
> >> > I could see having a "~/.git_tr2_config" or something similar in
> >> > some place like "/etc" that only contained the Trace2 settings. It
> >> > would be safe to read very early inside main() and we would not
> >> > consider it to be part of the real config.  For example, "git
> >> > config" would not know about it.  Then you could enforce a
> >> > system-wide setting without any of the env var issues.
> >>
> >> I haven't written a patch for this, but it seems to me that we could
> >> just start reading /etc/gitconfig via some custom config callback
> >> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
> >> files", 2018-10-21) does for the worktree config.
> >
> > Oy. Oy, oy, oy.
> >
> > Maybe use `read_early_config()` instead? That would be *a lot*
> > cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
> > machinery to expand aliases, 2017-06-14) as an inspiration.
>
> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
> critical time (this is the performance trace path, after all...).
>
> But on a second reading I see that read_early_config() can do that if
> you set config_source->file, opts->respect_includes etc. I.e. it just
> (depending on options) resolves to git_config_from_file() which
> 58b284a2e91 used directly.

Sure, it can exclude the repo and user config, but would that not be
rather confusing?

Ciao,
Dscho

--8323328-962455917-1553260658=:41--

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2019

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
> Hi Ævar,
> 
> On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
> 
>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>
>>> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>>
>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>
>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>> would be safe to read very early inside main() and we would not
>>>>> consider it to be part of the real config.  For example, "git
>>>>> config" would not know about it.  Then you could enforce a
>>>>> system-wide setting without any of the env var issues.
>>>>
>>>> I haven't written a patch for this, but it seems to me that we could
>>>> just start reading /etc/gitconfig via some custom config callback
>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>> files", 2018-10-21) does for the worktree config.
>>>
>>> Oy. Oy, oy, oy.
>>>
>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>
>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>> critical time (this is the performance trace path, after all...).
>>
>> But on a second reading I see that read_early_config() can do that if
>> you set config_source->file, opts->respect_includes etc. I.e. it just
>> (depending on options) resolves to git_config_from_file() which
>> 58b284a2e91 used directly.
> 
> Sure, it can exclude the repo and user config, but would that not be
> rather confusing?

This was hidden in my earlier message.

There's a lot a config machinery here with lots of chicken-n-egg
problems.  I want to know at the top of main() as quickly as possible
whether trace2 should be enabled.

I don't want to slow down git by spinning up a bunch of trace2 state
and wait until the git-dir is discovered, the "-c" args are processed,
and we dispatch into the builtin layer and the config is enumerated
to know if it should really be on or not.

I also didn't want to introduce another full iteration of the full
config system for startup performance reasons.

I played with read_early_config() at one point and it always seemed
to introduce side-effects.  Perhaps I was calling it earlier than it
was expecting and that triggered some of the git-dir discovery or
something. I don't remember all the details, I just remember that it
changed some behaviors in subtle ways.

Perhaps I could call something like git_config_from_file() with the
right set of magic bits to get it to parse exactly 1 system config
file that would contain my trace2 settings.  Hopefully, this will
not have any side-effects.

But if we lump them in with the main /etc/gitconfig settings, we
would have to explain that these trace2 config settings are
system-only and ARE NOT overridden by "-c", global, local, ...
config settings.  This would get confusing if the user tried to
set local values and did:
	git config --list --show-origin
and it showed system and local values but yet "stubbornly" refused
to use the local values over the system values. (I think this was
Johannes' point.)

That's why I was suggesting a system trace2 config file that is a
peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
values and not be expected to interact with the main config system.
That is, we just use the git_config_ routines to parse the file format,
rather than inventing another file format, but not change the
expectation of the established config value inheritance system.

If there's no objections, I'll take a look at doing this.

Thanks
Jeff


@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2019

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):


On Fri, Mar 22 2019, Jeff Hostetler wrote:

> On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
>> Hi =C3=86var,
>>
>> On Thu, 21 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
>>
>>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>>
>>>> On Sun, 17 Mar 2019, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:
>>>>
>>>>>
>>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>>
>>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>>> would be safe to read very early inside main() and we would not
>>>>>> consider it to be part of the real config.  For example, "git
>>>>>> config" would not know about it.  Then you could enforce a
>>>>>> system-wide setting without any of the env var issues.
>>>>>
>>>>> I haven't written a patch for this, but it seems to me that we could
>>>>> just start reading /etc/gitconfig via some custom config callback
>>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>>> files", 2018-10-21) does for the worktree config.
>>>>
>>>> Oy. Oy, oy, oy.
>>>>
>>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>>
>>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>>> critical time (this is the performance trace path, after all...).
>>>
>>> But on a second reading I see that read_early_config() can do that if
>>> you set config_source->file, opts->respect_includes etc. I.e. it just
>>> (depending on options) resolves to git_config_from_file() which
>>> 58b284a2e91 used directly.
>>
>> Sure, it can exclude the repo and user config, but would that not be
>> rather confusing?
>
> This was hidden in my earlier message.
>
> There's a lot a config machinery here with lots of chicken-n-egg
> problems.  I want to know at the top of main() as quickly as possible
> whether trace2 should be enabled.
>
> I don't want to slow down git by spinning up a bunch of trace2 state
> and wait until the git-dir is discovered, the "-c" args are processed,
> and we dispatch into the builtin layer and the config is enumerated
> to know if it should really be on or not.
>
> I also didn't want to introduce another full iteration of the full
> config system for startup performance reasons.
>
> I played with read_early_config() at one point and it always seemed
> to introduce side-effects.  Perhaps I was calling it earlier than it
> was expecting and that triggered some of the git-dir discovery or
> something. I don't remember all the details, I just remember that it
> changed some behaviors in subtle ways.
>
> Perhaps I could call something like git_config_from_file() with the
> right set of magic bits to get it to parse exactly 1 system config
> file that would contain my trace2 settings.  Hopefully, this will
> not have any side-effects.

Right, it also occurred to me that e.g. /home tends to be on NFS on some
systems, but very rarely /etc. I'd hate for trace reporting for git to
stall because NFS slows to a halt, so aside from temporary
implementation details (e.g. -c on the CLI not working) there's a good
case to be made for "read this from /etc/gitconfig only".

> But if we lump them in with the main /etc/gitconfig settings, we
> would have to explain that these trace2 config settings are
> system-only and ARE NOT overridden by "-c", global, local, ...
> config settings.  This would get confusing if the user tried to
> set local values and did:
> 	git config --list --show-origin
> and it showed system and local values but yet "stubbornly" refused
> to use the local values over the system values. (I think this was
> Johannes' point.)
>
> That's why I was suggesting a system trace2 config file that is a
> peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
> values and not be expected to interact with the main config system.
> That is, we just use the git_config_ routines to parse the file format,
> rather than inventing another file format, but not change the
> expectation of the established config value inheritance system.
>
> If there's no objections, I'll take a look at doing this.

I'd much rather just drop it in /etc/gitconfig with documented caveats
than introduce a new and permanent thing like /etc/gittrace2 due to a
current implementation detail.

Unlike something in core.* or whatever this facility is specialized
enough that I think it's fine to make it a bit of a special snowflake
given what it does and the target audience.

But even with those caveats it's still useful to see it in 'git config
-l --show-origin' for inspecting, and e.g. have it just work out of the
box with say the default puppetry for the likes of GitLab that now knows
how to set stuff in its /etc/gitconfig, but would need a special case
just for this.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2019

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 3/22/2019 10:53 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Mar 22 2019, Jeff Hostetler wrote:
> 
>> On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
>>> Hi Ævar,
>>>
>>> On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>>>
>>>>> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>>>
>>>>>>
>>>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>>>
>>>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>>>> would be safe to read very early inside main() and we would not
>>>>>>> consider it to be part of the real config.  For example, "git
>>>>>>> config" would not know about it.  Then you could enforce a
>>>>>>> system-wide setting without any of the env var issues.
>>>>>>
>>>>>> I haven't written a patch for this, but it seems to me that we could
>>>>>> just start reading /etc/gitconfig via some custom config callback
>>>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>>>> files", 2018-10-21) does for the worktree config.
>>>>>
>>>>> Oy. Oy, oy, oy.
>>>>>
>>>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>>>
>>>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>>>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>>>> critical time (this is the performance trace path, after all...).
>>>>
>>>> But on a second reading I see that read_early_config() can do that if
>>>> you set config_source->file, opts->respect_includes etc. I.e. it just
>>>> (depending on options) resolves to git_config_from_file() which
>>>> 58b284a2e91 used directly.
>>>
>>> Sure, it can exclude the repo and user config, but would that not be
>>> rather confusing?
>>
>> This was hidden in my earlier message.
>>
>> There's a lot a config machinery here with lots of chicken-n-egg
>> problems.  I want to know at the top of main() as quickly as possible
>> whether trace2 should be enabled.
>>
>> I don't want to slow down git by spinning up a bunch of trace2 state
>> and wait until the git-dir is discovered, the "-c" args are processed,
>> and we dispatch into the builtin layer and the config is enumerated
>> to know if it should really be on or not.
>>
>> I also didn't want to introduce another full iteration of the full
>> config system for startup performance reasons.
>>
>> I played with read_early_config() at one point and it always seemed
>> to introduce side-effects.  Perhaps I was calling it earlier than it
>> was expecting and that triggered some of the git-dir discovery or
>> something. I don't remember all the details, I just remember that it
>> changed some behaviors in subtle ways.
>>
>> Perhaps I could call something like git_config_from_file() with the
>> right set of magic bits to get it to parse exactly 1 system config
>> file that would contain my trace2 settings.  Hopefully, this will
>> not have any side-effects.
> 
> Right, it also occurred to me that e.g. /home tends to be on NFS on some
> systems, but very rarely /etc. I'd hate for trace reporting for git to
> stall because NFS slows to a halt, so aside from temporary
> implementation details (e.g. -c on the CLI not working) there's a good
> case to be made for "read this from /etc/gitconfig only".
> 
>> But if we lump them in with the main /etc/gitconfig settings, we
>> would have to explain that these trace2 config settings are
>> system-only and ARE NOT overridden by "-c", global, local, ...
>> config settings.  This would get confusing if the user tried to
>> set local values and did:
>> 	git config --list --show-origin
>> and it showed system and local values but yet "stubbornly" refused
>> to use the local values over the system values. (I think this was
>> Johannes' point.)
>>
>> That's why I was suggesting a system trace2 config file that is a
>> peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
>> values and not be expected to interact with the main config system.
>> That is, we just use the git_config_ routines to parse the file format,
>> rather than inventing another file format, but not change the
>> expectation of the established config value inheritance system.
>>
>> If there's no objections, I'll take a look at doing this.
> 
> I'd much rather just drop it in /etc/gitconfig with documented caveats
> than introduce a new and permanent thing like /etc/gittrace2 due to a
> current implementation detail.
> 
> Unlike something in core.* or whatever this facility is specialized
> enough that I think it's fine to make it a bit of a special snowflake
> given what it does and the target audience.
> 
> But even with those caveats it's still useful to see it in 'git config
> -l --show-origin' for inspecting, and e.g. have it just work out of the
> box with say the default puppetry for the likes of GitLab that now knows
> how to set stuff in its /etc/gitconfig, but would need a special case
> just for this.
> 

ok, if we're comfortable with those caveats, i'll proceed as
you suggested. thanks for the quick response.

Jeff

die(_("unable to join load_index_extensions thread: %s"), strerror(ret));
} else {
p.src_offset = src_offset;
load_index_extensions(&p);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Fix a trivial bug that's been here since the shared/do_write_index
tracing was added in 42fee7a388 ("trace2:data: add trace2
instrumentation to index read/write", 2019-02-22). We should have
enter/leave points, not two enter/enter points. This resulted in an
"enter" event without a corresponding "leave" event.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 61b043bac3..4fad4e3f9a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -3131,7 +3131,7 @@ static int write_shared_index(struct index_state *istate,
 	trace2_region_enter_printf("index", "shared/do_write_index",
 				   the_repository, "%s", (*temp)->filename.buf);
 	ret = do_write_index(si->base, *temp, 1);
-	trace2_region_enter_printf("index", "shared/do_write_index",
+	trace2_region_leave_printf("index", "shared/do_write_index",
 				   the_repository, "%s", (*temp)->filename.buf);
 
 	if (ret)
-- 
2.21.0.1020.gf2820cf01a

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/10/2019 9:37 AM, Ævar Arnfjörð Bjarmason wrote:
> Fix a trivial bug that's been here since the shared/do_write_index
> tracing was added in 42fee7a388 ("trace2:data: add trace2
> instrumentation to index read/write", 2019-02-22). We should have
> enter/leave points, not two enter/enter points. This resulted in an
> "enter" event without a corresponding "leave" event.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
> ---
>  read-cache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 61b043bac3..4fad4e3f9a 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -3131,7 +3131,7 @@ static int write_shared_index(struct index_state *istate,
>  	trace2_region_enter_printf("index", "shared/do_write_index",
>  				   the_repository, "%s", (*temp)->filename.buf);
>  	ret = do_write_index(si->base, *temp, 1);
> -	trace2_region_enter_printf("index", "shared/do_write_index",
> +	trace2_region_leave_printf("index", "shared/do_write_index",
>  				   the_repository, "%s", (*temp)->filename.buf);
>  
>  	if (ret)
> 

Thanks! Obviously correct.

-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 5/10/2019 9:37 AM, Ævar Arnfjörð Bjarmason wrote:
> Fix a trivial bug that's been here since the shared/do_write_index
> tracing was added in 42fee7a388 ("trace2:data: add trace2
> instrumentation to index read/write", 2019-02-22). We should have
> enter/leave points, not two enter/enter points. This resulted in an
> "enter" event without a corresponding "leave" event.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
> ---
>   read-cache.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 61b043bac3..4fad4e3f9a 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -3131,7 +3131,7 @@ static int write_shared_index(struct index_state *istate,
>   	trace2_region_enter_printf("index", "shared/do_write_index",
>   				   the_repository, "%s", (*temp)->filename.buf);
>   	ret = do_write_index(si->base, *temp, 1);
> -	trace2_region_enter_printf("index", "shared/do_write_index",
> +	trace2_region_leave_printf("index", "shared/do_write_index",
>   				   the_repository, "%s", (*temp)->filename.buf);
>   
>   	if (ret)
> 

Good catch!

Thanks,
Jeff

LIB_OBJS += symlinks.o
LIB_OBJS += tag.o
LIB_OBJS += tempfile.o
LIB_OBJS += thread-utils.o
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, SZEDER Gábor wrote (reply to this):

On Fri, Feb 22, 2019 at 02:25:01PM -0800, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <[email protected]>
> 
> Create a new unified tracing facility for git.  The eventual intent is to
> replace the current trace_printf* and trace_performance* routines with a
> unified set of git_trace2* routines.
> 
> In addition to the usual printf-style API, trace2 provides higer-level
> event verbs with fixed-fields allowing structured data to be written.
> This makes post-processing and analysis easier for external tools.
> 
> Trace2 defines 3 output targets.  These are set using the environment
> variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".  These may be
> set to "1" or to an absolute pathname (just like the current GIT_TRACE).
> 
> * GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
>   summary data.
> 
> * GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
>   It extends the output with columns for the command process, thread,
>   repo, absolute and relative elapsed times.  It reports events for
>   child process start/stop, thread start/stop, and per-thread function
>   nesting.
> 
> * GIT_TR2_EVENT is a new structured format. It writes event data as a
>   series of JSON records.

Please document these new environment variables in
'Documentation/git.txt', where the other environment variables,
including GIT_TRACE_*, are already documented.

While doing so, please note that the description about the possible
values of these variables and of GIT_TRACE above is incomplete,
because it's not just "1" or an absolute pathname.  Quoting the
description of GIT_TRACE:

  If this variable is set to "1", "2" or "true" (comparison is case
  insensitive), trace messages will be printed to stderr.
  
  If the variable is set to an integer value greater than 2 and lower
  than 10 (strictly) then Git will interpret this value as an open
  file descriptor and will try to write the trace messages into this
  file descriptor.

The way I see it in tr2_dst_get_trace_fd() below, this applies to
GIT_TRACE2* as well, and it even offers the possibility to specify a
Unix Domain Socket, too.


And sorry for barging in with a big bucket of paint this late, but,
really...  why GIT_TR2 instead of GIT_TRACE2?


> +int tr2_dst_get_trace_fd(struct tr2_dst *dst)
> +{
> +	const char *tgt_value;
> +
> +	/* don't open twice */
> +	if (dst->initialized)
> +		return dst->fd;
> +
> +	dst->initialized = 1;
> +
> +	tgt_value = getenv(dst->env_var_name);
> +
> +	if (!tgt_value || !strcmp(tgt_value, "") || !strcmp(tgt_value, "0") ||
> +	    !strcasecmp(tgt_value, "false")) {
> +		dst->fd = 0;
> +		return dst->fd;
> +	}
> +
> +	if (!strcmp(tgt_value, "1") || !strcasecmp(tgt_value, "true")) {
> +		dst->fd = STDERR_FILENO;
> +		return dst->fd;
> +	}
> +
> +	if (strlen(tgt_value) == 1 && isdigit(*tgt_value)) {
> +		dst->fd = atoi(tgt_value);
> +		return dst->fd;
> +	}
> +
> +	if (is_absolute_path(tgt_value))
> +		return tr2_dst_try_path(dst, tgt_value);
> +
> +#ifndef NO_UNIX_SOCKETS
> +	if (starts_with(tgt_value, PREFIX_AF_UNIX))
> +		return tr2_dst_try_unix_domain_socket(dst, tgt_value);
> +#endif
> +
> +	/* Always warn about malformed values. */
> +	tr2_dst_malformed_warning(dst, tgt_value);
> +	tr2_dst_trace_disable(dst);
> +	return 0;
> +}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/10/2019 1:28 PM, SZEDER Gábor wrote:
> On Fri, Feb 22, 2019 at 02:25:01PM -0800, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <[email protected]>
>>
>> * GIT_TR2_EVENT is a new structured format. It writes event data as a
>>   series of JSON records.
> 
> Please document these new environment variables in
> 'Documentation/git.txt', where the other environment variables,
> including GIT_TRACE_*, are already documented.

Thanks for the report!
 
> While doing so, please note that the description about the possible
> values of these variables and of GIT_TRACE above is incomplete,
> because it's not just "1" or an absolute pathname.  Quoting the
> description of GIT_TRACE:
> 
>   If this variable is set to "1", "2" or "true" (comparison is case
>   insensitive), trace messages will be printed to stderr.
>   
>   If the variable is set to an integer value greater than 2 and lower
>   than 10 (strictly) then Git will interpret this value as an open
>   file descriptor and will try to write the trace messages into this
>   file descriptor.
> 
> The way I see it in tr2_dst_get_trace_fd() below, this applies to
> GIT_TRACE2* as well, and it even offers the possibility to specify a
> Unix Domain Socket, too.


Please see [1] for a possible direction here.

[1] https://public-inbox.org/git/[email protected]/


> And sorry for barging in with a big bucket of paint this late, but,
> really...  why GIT_TR2 instead of GIT_TRACE2?

I did not do anything in response to this. I'll leave that discussion
to others.

Thanks,
-Stolee

@jeffhostetler jeffhostetler deleted the core-trace2-2019-v0 branch July 1, 2019 18:32
gitgitgadget bot pushed a commit that referenced this pull request Oct 21, 2024
Update clar from:

    - 1516124 (Merge pull request #97 from pks-t/pks-whitespace-fixes, 2024-08-15).

To:

    - 206accb (Merge pull request #108 from pks-t/pks-uclibc-without-wchar, 2024-10-21)

This update includes a bunch of fixes and improvements that we have
discussed in Git when initial support for clar was merged:

  - There is a ".editorconfig" file now.

  - Compatibility with Windows has been improved so that the clar
    compiles on this platform without an issue. This has been tested
    with Cygwin, MinGW and Microsoft Visual Studio.

  - clar now uses CMake. This does not impact us at all as we wire up
    the clar into our own build infrastructure anyway. This conversion
    was done such that we can easily run CI jobs against Windows.

  - Allocation failures are now checked for consistently.

  - We now define feature test macros in "clar.c", which fixes
    compilation on some platforms that didn't previously pull in
    non-standard functions like lstat(3p) or strdup(3p). This was
    reported by a user of OpenSUSE Leap.

  - We stop using `struct timezone`, which is undefined behaviour
    nowadays and results in a compilation error on some platforms.

  - We now use the combination of mktemp(3) and mkdir(3) on SunOS, same
    as we do on NonStop.

  - We now support uClibc without support for <wchar.h>.

The most important bits here are the improved platform compatibility
with Windows, OpenSUSE, SunOS and uClibc.

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Taylor Blau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants