<div dir="ltr"><div><div><div><div><div><div><div>Hi all,<br><br></div>I can download a large file from GitHub like this:<br><br><span style="font-family:monospace,monospace">import base64<br>import requests<br><br>file_content = requests.get(file_url, allow_redirects=True)<br>file_data = base64.b64decode(file_content.json()['content'])<br>with open(output, 'wb') as file:<br>&nbsp;&nbsp;&nbsp;&nbsp;file.write(file_data)</span><br><br></div>Since that takes a long time, I want to add a progress bar, and this is where my problems start ;-). I found that something like this should work:<br><br><span style="font-family:monospace,monospace">from progress.bar import Bar<br><br>file_size
= 19335882  # known in advance<br>req = requests.get(file_url, allow_redirects=True, stream=True)<br>block_size = 1024<br># count progress in bytes so the bar does not depend on the chunk size<br>bar = Bar(f'Downloading {filename}', max=file_size,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;suffix='%(percent).1f%% - %(eta)ds')<br>bytes_transferred = 0<br>with open(output, "wb") as file:<br>&nbsp;&nbsp;&nbsp;&nbsp;for chunk in req.iter_content(chunk_size=block_size):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if chunk:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bytes_transferred += len(chunk)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.write(chunk)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;bar.next(len(chunk))<br>bar.finish()<br></span></div><span style="font-family:monospace,monospace">print(bytes_transferred)</span><br><br></div>My problem: the number of bytes transferred does not match the file size (26640760 vs 19335882, so the progress bar does not show the correct progress), because GitHub does not send the file itself but the file wrapped in JSON and encoded in base64.<br><br></div>A workaround might be this: since I know the size of the final file, I could try to calculate the size of the JSON response (req.headers.get('Content-Length') does not work on GitHub in this case :-( ). After downloading it into memory, I would extract the content from it, decode it, and only then save it... The question is whether there is a smarter way to do this...<br><br></div>PS: my test URL is: <a href="https://api.github.com/repos/tesseract-ocr/tessdata/git/blobs/b01dab8de8174496a0012bf85296943b3e7c81d7">https://api.github.com/repos/tesseract-ocr/tessdata/git/blobs/b01dab8de8174496a0012bf85296943b3e7c81d7</a><br><br><br></div>Zd.<br></div>
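One possible alternative to estimating the JSON size is to ask the blobs endpoint for the raw bytes through the Accept header, so that the number of bytes received should match the known file size and the progress bar can be driven directly. The sketch below is only an illustration under that assumption: the helper names save_stream and download_raw_blob and the on_progress callback are hypothetical, and the network part has not been run against the test URL above.

```python
import io

# Media type that asks the GitHub blobs endpoint for raw bytes
# instead of the JSON/base64 envelope.
RAW_ACCEPT = "application/vnd.github.raw+json"


def save_stream(chunks, out, total_size, on_progress=None):
    """Write byte chunks to `out`; call on_progress(done, total_size) after each write.

    Returns the total number of bytes written. (Hypothetical helper.)
    """
    done = 0
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        out.write(chunk)
        done += len(chunk)
        if on_progress is not None:
            on_progress(done, total_size)
    return done


def download_raw_blob(file_url, output, total_size, on_progress=None,
                      block_size=64 * 1024):
    """Stream a blob to `output`, requesting the raw media type. (Hypothetical helper.)"""
    import requests  # third-party; imported here so save_stream works without it

    headers = {"Accept": RAW_ACCEPT}
    with requests.get(file_url, headers=headers, stream=True, timeout=30) as req:
        req.raise_for_status()
        with open(output, "wb") as out:
            chunks = req.iter_content(chunk_size=block_size)
            return save_stream(chunks, out, total_size, on_progress)


if __name__ == "__main__":
    # Offline demo: in-memory chunks stand in for req.iter_content(...)
    buffer = io.BytesIO()
    written = save_stream(
        [b"abc", b"", b"de"], buffer, 5,
        on_progress=lambda done, total: print(f"{done}/{total} bytes"),
    )
    print(written)
```

With the Bar from the original snippet, the callback could be something like `lambda done, total: bar.goto(done)` with `max=file_size`, followed by `bar.finish()` once the download returns. Because iter_content yields already-decoded bytes, the running count should end at the real file size even when Content-Length is missing.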