1

I have been trying to print git clone progress in a more minimalistic way for my project.

Aim

Instead of printing the whole git clone output on screen:

remote: Enumerating objects: 1845678, done.        
remote: Counting objects: 100% (503/503), done.        
remote: Compressing objects: 100% (79/79), done.        
Receiving objects:  28% (54112/1845678), 10.10 MiB | 2.00 MiB/s

I want to hide these lengthy lines of git output and print only the real-time progress of the clone, in the following format:

Cloning [$percentage]

What I have got so far

git clone --progress https://somerepo 2>&1 |tee gitclone.file | tr \\r \\n | total="$(awk '/Receiving objects/{print $3}')" | echo "$total"

Note: Since git clone writes its progress only to the stderr stream, I have redirected stderr to stdout. Even with the redirection I faced a few issues, so I added the --progress option to the git command.

I wanted to store the output in a file (for debugging the script) without disturbing the stdout stream, so I used the tee command. Since git clone ends its progress updates with \r instead of \n, I replace \r with \n so that the output can be captured line by line. For more information on this part, see this question and its answer: Git produces output in realtime to a file but I'm unable to echo it out in realtime directly in a while loop

Then I pick the lines that contain the keyword Receiving objects and print/store the third field of each such line.

What is my problem

My command works fine if I do not store the output of awk and just print it on screen:

git clone --progress https://somerepo 2>&1 |tee gitclone.file | tr \\r \\n | awk '/Receiving objects/{print $3}'

But I am unable to store the awk output in a shell variable and echo it back:

git clone --progress https://somerepo 2>&1 |tee gitclone.file | tr \\r \\n | total="$(awk '/Receiving objects/{print $3}')" | echo "$total"

So what could be a possible solution for this issue?

3 Answers

1

As the bash manual says:

Each command in a pipeline is executed as a separate process (i.e., in a subshell).

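So the value saved in the total variable is lost when the subshell exits, and the echo, which runs in a different subshell, never sees it. The value is visible only inside the subshell that set it, which you can check by grouping the assignment and the echo into one pipeline stage: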

git clone --progress https://somerepo |& tee gitclone.file \
| tr \\r \\n | { total="$(awk '/Receiving objects/{print $3}')" ; \
 echo "$total" ; }

Since the variable total is lost once the above pipeline has finished, you should instead put the whole pipeline inside a command substitution, $( ... ), like this:

total=$(git clone --progress https://somerepo |& tee gitclone.file | tr \\r \\n | awk '/Receiving objects/{print $3}')
echo "$total"
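
The difference between the two forms can be tried without a network connection. The following is only a sketch: fake_clone is a hypothetical stand-in for the real git clone --progress command, with made-up progress numbers, and the "lost" behaviour shown is that of bash with default settings (no lastpipe).

```shell
# fake_clone is a hypothetical stand-in for: git clone --progress <url> 2>&1
# It emits progress updates separated by \r, the way git does.
fake_clone() {
  printf 'remote: Counting objects: 100%% (503/503), done.\n'
  printf 'Receiving objects:  28%% (54112/1845678), 10.10 MiB | 2.00 MiB/s\r'
  printf 'Receiving objects: 100%% (1845678/1845678), 1.20 GiB | 2.00 MiB/s, done.\n'
}

# Same filter as in the question: split on \r, keep the third field.
extract_percent() {
  tr '\r' '\n' | awk '/Receiving objects/{print $3}'
}

# 1) Assignment inside a pipeline stage: it runs in a subshell,
#    so the parent shell's variable stays empty.
lost=""
fake_clone | extract_percent | { lost=$(tail -n 1); }
echo "after pipeline: lost='$lost'"

# 2) Command substitution around the whole pipeline: the parent shell
#    receives the output. tail -n 1 keeps only the last percentage.
total=$(fake_clone | extract_percent | tail -n 1)
echo "Cloning [$total]"
```

Note that the second form only yields a value once the pipeline has finished, so it gives the final percentage rather than live updates.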

However, if you want the pipeline (starting with the git command) to run in the background, you have to redirect awk's output to a file and read that file later. For example:

tmpfile=$(mktemp)
git ... >"$tmpfile" &
# ...
# Do other stuff...
# ...
wait # for background process to complete.
total=$(cat "$tmpfile")
rm "$tmpfile"
echo "$total"
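
One way the temporary-file approach could be extended to show progress while the background job is still running is to poll the file. This is a rough sketch under stated assumptions, not a tested recipe for real clones: fake_clone is a hypothetical stand-in for the git pipeline, the polling interval is arbitrary, and fractional sleep values are assumed to be supported (as in GNU sleep).

```shell
# fake_clone is a hypothetical stand-in for: git clone --progress <url> 2>&1
fake_clone() {
  for p in 0 50 100; do
    printf 'Receiving objects: %3d%% (%d/100)\r' "$p" "$p"
    sleep 0.1
  done
  printf '\n'
}

# Print the most recent percentage found in the given file (empty if none yet).
latest_percent() {
  tr '\r' '\n' <"$1" | awk '/Receiving objects/{p=$3} END{print p}'
}

tmpfile=$(mktemp)
fake_clone >"$tmpfile" 2>&1 &
pid=$!

# While the background job is alive, report the latest value seen so far.
while kill -0 "$pid" 2>/dev/null; do
  latest=$(latest_percent "$tmpfile")
  if [ -n "$latest" ]; then
    printf 'Cloning [%s]\n' "$latest"
  fi
  sleep 0.05
done

wait "$pid"                          # collect the background job
total=$(latest_percent "$tmpfile")   # final value after completion
rm -f "$tmpfile"
echo "Cloning [$total]"
```

The parent shell never needs a variable set by the background job; it only reads what the job has written to the file so far.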

A hint: To redirect stdout and stderr of the git command to the tee command you can use the |& shorthand like this: git clone --progress https://somerepo |& tee gitclone.file | ...

FedKad
  • Hello, thanks for your time and effort. But I am still not getting the output through the total variable. Since I need to perform this operation in the background, I ran your suggested command like this: total=$(git clone --progress https://somerepo 2>&1 |tee gitclone.file | tr \\r \\n | awk '/Receiving objects/{print $3}')& echo "$total" Is there any other way I could try? – Eswar Reddy Jan 25 '23 at 09:46
  • @EswarReddy remove the `&` - if you send it to the background, the whole thing will be done in a subshell and the parent shell won't see changes in the variable. – muru Jan 25 '23 at 09:50
  • Thanks for the suggestion, muru. I totally agree with you. But if I run this command in the foreground, I won't get the terminal back until the clone is completed, right? I need to clone and print the progress on the same terminal. So can I pipe into the echo command, like this: total=$(git clone --progress https://somerepo 2>&1 |tee gitclone.file | tr \\r \\n | awk '/Receiving objects/{print $3}')|echo "$total" – Eswar Reddy Jan 25 '23 at 09:59
  • Your requirement is not clear: you need to run the command in the background, but obtain the value in `total` when? You cannot do this as long as the command is not finished. Please edit your question and make it clearer. – FedKad Jan 25 '23 at 10:30
  • Hello Fed, I have edited the aim part of my question. Please have a look at it. – Eswar Reddy Jan 25 '23 at 11:31
0

I think that the problem is with git's output. It does not emit newlines while rewriting the "Receiving objects:" line; each update ends with a carriage return (\r) instead.

You can tell this is the case by looking at the output of

GIT_FLUSH=1 git clone --progress $repo 2>&1 | cat -bu

You will not see line numbers after the first occurrence of the "Receiving" line. Here is an example where I pipe the output into "od" to make the \r and \n visible:

0000200                   \n                       4  \t   R   e   c   e
0000220    i   v   i   n   g       o   b   j   e   c   t   s   :        
0000240        0   %       (   1   /   1   1   0   3   8   )  \r   R   e
0000260    c   e   i   v   i   n   g       o   b   j   e   c   t   s   :
0000300                0   %       (   4   9   /   1   1   0   3   8   )
0000320    ,       8   .   8   8       M   i   B       |       2   .   8
0000340    4       M   i   B   /   s  \r 

A program that reads input line by line (like awk) will not see those lines until git is finished.
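
The effect on a line-oriented reader can be reproduced with a short sketch. The sample function below is hypothetical; it only imitates the \r-separated updates visible in the od dump above.

```shell
# sample imitates git's progress stream: updates separated by \r,
# with a single \n at the very end.
sample() {
  printf 'Receiving objects:   0%% (1/11038)\r'
  printf 'Receiving objects:   0%% (49/11038), 8.88 MiB | 2.84 MiB/s\r'
  printf 'Receiving objects: 100%% (11038/11038), done.\n'
}

# awk splits records on \n, so the whole stream arrives as one record.
raw_records=$(sample | awk 'END{print NR}')

# After translating \r to \n, every update becomes its own record.
split_records=$(sample | tr '\r' '\n' | awk 'END{print NR}')

echo "records without tr: $raw_records"
echo "records with tr:    $split_records"
```

This prints 1 for the raw stream and 3 after the translation, which is why the tr stage in the question's pipeline is needed before awk can pick out individual updates.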

neuhaus
0

You've fundamentally got a pipeline buffering issue. The input and/or output buffers used by the programs in the pipeline are too big. Fortunately there is a way to tell each program in the pipeline to buffer only one line.

This is the program you need: https://manpages.ubuntu.com/manpages/bionic/man1/unbuffer.1.html.

It's installed by default in Ubuntu Desktop, I think, but if not:

sudo apt install expect

Then you can include the unbuffer command in your pipeline to solve the problem:

REPO_URL=https://something   # or git@something
unbuffer git clone --progress "$REPO_URL" 2>&1 | \
  unbuffer -p tr \\r \\n | \
  awk '/Receiving objects/{print $3; total=$3} END{print total}'

It prints 0%, 1%, ... 100% as the clone progresses, not all at the end or in large chunks. Then, because awk's total holds the last of those values, the END block prints 100% once more.
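
To get the Cloning [$percentage] format asked for in the question, the awk stage could do the formatting itself. The following is a minimal sketch: format_progress is a hypothetical helper name, and the input here is simulated with printf so that it runs without git or unbuffer.

```shell
# format_progress is a hypothetical helper: it turns each
# "Receiving objects" record into a "Cloning [NN%]" line and flushes
# after every record so nothing lingers in awk's output buffer.
format_progress() {
  awk '/Receiving objects/ { printf "Cloning [%s]\n", $3; fflush() }'
}

# Simulated input in place of the real pipeline, which would look like:
#   unbuffer git clone --progress "$REPO_URL" 2>&1 | unbuffer -p tr \\r \\n | format_progress
out=$(printf 'Receiving objects:  28%% (1/2)\rReceiving objects: 100%% (2/2), done.\n' \
  | tr '\r' '\n' | format_progress)
printf '%s\n' "$out"
```

With the simulated input this prints Cloning [28%] followed by Cloning [100%]. In the real pipeline each line would appear as soon as the corresponding update arrives, provided the earlier stages are unbuffered as described above.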
