• sga@piefed.social · 23 days ago

    try something along the lines of

    # -r recurse, -np don't ascend to the parent dir, -k convert links for local viewing, -p grab page requisites (css/js/images)
    wget -r -np -k -p "<url of website to archive>"

    may work, but in case it does not, i would download the page html, filter out all the pdf links (some regex or grep magic), and then just give that list to wget or some other file downloader. something like the sketch below.
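
    a minimal sketch, assuming the pdf links appear as absolute urls in the page source (the url and the page/list filenames are placeholders):

    # pull the page, extract pdf links, hand the list to wget
    curl -s "https://example.com/page.html" \
      | grep -oE 'https?://[^"]+\.pdf' \
      | sort -u > pdf-list.txt
    wget -i pdf-list.txt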

    if you can give the url, we can get a bit more specific.

    • tdTrX@lemmy.ml (OP) · 22 days ago

      Website needs login.

      I downloaded some PDFs manually via F12 (devtools), but they are password-protected. How do I unlock them or get the password?

      • sga@piefed.social · 22 days ago

        in this case, first build a list of the pdf urls, then grab your session cookies from the browser, and use curl plus a bit of scripting to fetch everything. roughly like the sketch below.
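
        a rough sketch, assuming your cookies are exported to a netscape-format cookies.txt (e.g. via a browser extension) and pdf-list.txt holds the urls (both filenames are placeholders):

        # replay the browser session for each pdf url
        while read -r url; do
          curl -b cookies.txt -O "$url"
        done < pdf-list.txt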

        • sga@piefed.social · 22 days ago

          for cookies, you can open devtools, go to the network tab, find the request for the pdf file, right click it, and you will find an option along the lines of 'Copy as cURL'. copy that and paste it somewhere, then repeat the exercise for another file. this should show you the pattern for making the query. most likely it just needs a bearer token or session cookie in the request headers, or something like that. the copied command will look roughly like the sketch below.
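
          roughly what 'Copy as cURL' gives you (the url and header values here are made up), and how to reuse the pattern across the whole list:

          # one copied request, with the auth bits the server actually checks
          curl 'https://example.com/files/doc1.pdf' \
            -H 'Authorization: Bearer eyJhbGciOi...' \
            -H 'Cookie: session=abc123' -O
          # once the pattern is clear, loop the same headers over pdf-list.txt
          while read -r url; do
            curl -H 'Authorization: Bearer eyJhbGciOi...' \
                 -H 'Cookie: session=abc123' -O "$url"
          done < pdf-list.txt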