AWS4 Examples


Uploading files to an AWS4 S3 storage

In the following example we upload all files from the local directory /home/data to AWS4 S3 storage via HTTPS.

      [directory]
      /home/data

         [files]
         *

             [destination]

                  [recipient]
                  https://donald:secret@hollywood/bucketname/data;auth=aws4-hmac-sha256

                  [options]
                  trans_srename * %tY%tm%td/*

The only difference from an ordinary HTTPS recipient is the auth= parameter at the end of the URL, which is set to aws4-hmac-sha256. Without it AFD would try the default basic authentication. The bucket name is either part of the hostname or the first element of the path.

Note that AWS4 S3 has a flat data structure and does not know about directories. So if data does not exist in the above example, S3 simply creates a new filename (in AWS terminology: key) with / as part of the name. AFD therefore does not need to perform any extra operation to create the directory.
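As a rough illustration of the flat structure, the following Python sketch builds the key that the upload example would produce. It assumes that %tY%tm%td in the rename rule expands to the current date as YYYYMMDD and that everything after the bucket name in the URL path becomes part of the key. The function object_key and the file name report.txt are hypothetical and are not part of AFD.

```python
from datetime import date

def object_key(path_prefix: str, filename: str, day: date) -> str:
    """Build a flat S3 key for a file renamed by the rule '* %tY%tm%td/*'.

    Assumption: %tY%tm%td stands for year, month and day of the current date.
    """
    dated_name = f"{day:%Y%m%d}/{filename}"
    # The slashes are ordinary characters of the key; no directory is created.
    return f"{path_prefix}/{dated_name}"

key = object_key("data", "report.txt", date(2021, 9, 7))
print(key)  # data/20210907/report.txt
```

The result is a single name that merely looks like a path, which is why no separate step to create data or the date directory is needed.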

Downloading files from AWS4 S3

Downloading files is very similar to downloading via other protocols such as FTP, SFTP and HTTP. But because AWS S3 has a flat structure, some extra options need explaining. Assume a bucket test with the following content:

          test
           |
           +------- amdgpu.2
           +------- aws_authorization.c
           +------- README.configure
           |
           +------- test2
           |         |
           |         +------- 1X7
           |         |         |
           |         |         +------- rename.rule2
           |         |
           |         +------- afd.users.sample
           |
           +------- test3
                     |
                     +------- 1X8
                     |         |
                     |         +------- rename.rule
                     |
                     +------- afd.users.sample
                     +------- afd.users.sample2
                     +------- afd.users.sample3
                     +------- afd.users.sample4
                     +------- afd.users.sample5
                     +------- IMG_0843.JPG

Given the above layout, if we only want to retrieve the 3 files in the root of bucket test, we can configure it as follows:

      [directory]
      https://user:secret@host/test;auth=aws4-hmac-sha256

         [dir options]
         store retrieve list
         do not remove
         time * * * * *

         [files]
         *

             [destination]

                  [recipient]
                  file://local//tmp

When AFD requests a listing of the available files, it sees them as follows (from TRANS_DB_LOG with trace enabled):

      06 14:21:00  host [0]: eval_html_dir_list(): filename=README.configure length=16 mtime=1611483449 exact=3 size=22944 exact=1
      06 14:21:00  host [0]: eval_html_dir_list(): filename=amdgpu.2 length=8 mtime=1628350724 exact=3 size=2785 exact=1
      06 14:21:00  host [0]: eval_html_dir_list(): filename=aws_authorization.c length=19 mtime=1628349108 exact=3 size=8120 exact=1

If one adds the [dir options] no delimiter:

      [directory]
      https://user:secret@host/test;auth=aws4-hmac-sha256

         [dir options]
         no delimiter
         store retrieve list
         do not remove
         time * * * * *

         [files]
         *

             [destination]

                  [recipient]
                  file://local//tmp

One gets a listing of all files in the bucket test:

      06 14:23:00  host [0]: eval_html_dir_list(): filename=README.configure length=16 mtime=1611483449 exact=3 size=22944 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=amdgpu.2 length=8 mtime=1628350724 exact=3 size=2785 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=aws_authorization.c length=19 mtime=1628349108 exact=3 size=8120 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test2/1X7/rename.rule2 length=22 mtime=1630912523 exact=3 size=1805 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test2/afd.users.sample length=22 mtime=1628926069 exact=3 size=3736 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/1X8/rename.rule length=21 mtime=1630912441 exact=3 size=9985 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/IMG_0843.JPG length=18 mtime=1629042168 exact=3 size=1073143 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/afd.users.sample length=22 mtime=1628938088 exact=3 size=3736 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/afd.users.sample2 length=23 mtime=1628939026 exact=3 size=3736 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/afd.users.sample3 length=23 mtime=1628940807 exact=3 size=3736 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/afd.users.sample4 length=23 mtime=1628941351 exact=3 size=3736 exact=1
      06 14:23:00  host [0]: eval_html_dir_list(): filename=test3/afd.users.sample5 length=23 mtime=1628963020 exact=3 size=3736 exact=1
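The difference between the two listings can be imitated with a small Python sketch. It assumes that AFD by default asks S3 to use / as delimiter, so that keys containing a / are rolled up into common prefixes, and that no delimiter switches this off. The function list_bucket is hypothetical and only mimics the S3 listing behaviour. It is not AFD code.

```python
# Keys of the example bucket "test"; S3 stores them as flat names.
KEYS = [
    "README.configure",
    "amdgpu.2",
    "aws_authorization.c",
    "test2/1X7/rename.rule2",
    "test2/afd.users.sample",
    "test3/1X8/rename.rule",
    "test3/IMG_0843.JPG",
    "test3/afd.users.sample",
    "test3/afd.users.sample2",
    "test3/afd.users.sample3",
    "test3/afd.users.sample4",
    "test3/afd.users.sample5",
]

def list_bucket(keys, delimiter=None):
    """Imitate an S3 listing: return (object keys, common prefixes)."""
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if delimiter and delimiter in key:
            # Report everything up to the first delimiter once as a prefix.
            rolled = key.split(delimiter, 1)[0] + delimiter
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
        else:
            contents.append(key)
    return contents, common_prefixes

default_files, default_prefixes = list_bucket(KEYS, delimiter="/")
all_files, _ = list_bucket(KEYS)  # corresponds to 'no delimiter'

print(default_files)     # ['README.configure', 'amdgpu.2', 'aws_authorization.c']
print(default_prefixes)  # ['test2/', 'test3/']
print(len(all_files))    # 12
```

With the delimiter only the three root files are returned as files. Without it every key in the bucket is returned.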

Since AFD cuts off the path by default, you will notice that the file test/test2/afd.users.sample is overwritten by test/test3/afd.users.sample. To avoid this, you can use the [dir options] keep path:

      [directory]
      https://user:secret@host/test;auth=aws4-hmac-sha256

         [dir options]
         no delimiter
         keep path
         store retrieve list
         do not remove
         time * * * * *

         [files]
         *

             [destination]

                  [recipient]
                  file://local//tmp

Now AFD keeps the path, but it replaces the delimiter / with \, since under Unix a filename cannot contain a /. The \ is not nice either, so it is recommended to use the srename option to rename the file to something more useful.
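The effect of keep path on the local filename can be sketched in Python as follows. The function local_name is hypothetical and only mirrors the behaviour described here: the path is cut off by default, and with keep path every / is replaced by \.

```python
import posixpath

keys = ["test2/afd.users.sample", "test3/afd.users.sample"]

def local_name(key, keep_path=False):
    """Return the filename a key would get locally (illustrative only)."""
    if keep_path:
        # A Unix filename cannot contain '/', so it becomes a backslash.
        return key.replace("/", "\\")
    # Default: the path is cut off and only the last element remains.
    return posixpath.basename(key)

stripped = [local_name(k) for k in keys]
kept = [local_name(k, keep_path=True) for k in keys]

print(stripped[0] == stripped[1])  # True: the second file overwrites the first
for name in kept:
    print(name)
```

Without keep path both keys end up with the same local name. With it the names stay unique but contain a backslash.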

Note that AFD uses the prefix option when it requests a listing from the AWS4 S3 service, which helps to reduce the listing result if you use the no delimiter option. The prefix option is added as soon as you specify a path after the bucketname:

      [directory]
      https://user:secret@host/test/test2;auth=aws4-hmac-sha256

         [dir options]
         no delimiter
         keep path
         store retrieve list
         do not remove
         time * * * * *

         [files]
         *

             [destination]

                  [recipient]
                  file://local//tmp

The listing of the above configuration will look as follows:

      06 10:22:00  host [0]: eval_html_dir_list(): filename=test2/1X7/rename.rule2 length=22 mtime=1630912523 exact=3 size=1805 exact=1
      06 10:22:00  host [0]: eval_html_dir_list(): filename=test2/afd.users.sample length=22 mtime=1628926069 exact=3 size=3736 exact=1
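How the path after the bucket name turns into a prefix can be approximated with the Python sketch below. It assumes the bucket name is the first path element (the variant with the bucket in the hostname is not handled) and that the rest of the path is sent as prefix. The function bucket_and_prefix is hypothetical. AFD does its own URL parsing.

```python
from urllib.parse import urlsplit

def bucket_and_prefix(url):
    """Split the URL path into bucket name and listing prefix."""
    path = urlsplit(url).path        # still contains ';auth=...'
    path = path.partition(";")[0]    # drop the auth= parameter
    bucket, _, prefix = path.lstrip("/").partition("/")
    return bucket, prefix

keys = [
    "amdgpu.2",
    "test2/1X7/rename.rule2",
    "test2/afd.users.sample",
    "test3/afd.users.sample",
]

bucket, prefix = bucket_and_prefix(
    "https://user:secret@host/test/test2;auth=aws4-hmac-sha256")
# Only keys that start with the prefix are part of the listing.
matching = [k for k in keys if k.startswith(prefix)]

print(bucket, prefix)  # test test2
print(matching)        # ['test2/1X7/rename.rule2', 'test2/afd.users.sample']
```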

What also works, but is NOT recommended, is if you put the path under the [files] filter:

      [directory]
      https://user:secret@host/test;auth=aws4-hmac-sha256

         [dir options]
         no delimiter
         keep path
         store retrieve list
         do not remove
         time * * * * *

         [files]
         test2?*

             [destination]

                  [recipient]
                  file://local//tmp

Note the ? instead of the / as delimiter. If you put a / there, AFD would fetch the file test2/afd.users.sample but would not distribute it further, since AFD stores it as test2\afd.users.sample and the filter test2/* would not match. test2?* does match. Because it is so tricky, this method is not recommended. Rather work without 'no delimiter' and 'keep path' and just do it via the path:
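Why the ? is needed can be checked with a short Python sketch. It uses fnmatchcase from the Python standard library as a stand-in for the AFD file filter, on the assumption that * and ? behave the same way in both. AFD has its own matching code.

```python
from fnmatch import fnmatchcase

# Name under which AFD stores the fetched file when 'keep path' is set.
stored_name = "test2\\afd.users.sample"

slash_filter = fnmatchcase(stored_name, "test2/*")
question_filter = fnmatchcase(stored_name, "test2?*")

print(slash_filter)     # False: the stored name contains '\', not '/'
print(question_filter)  # True: '?' matches any single character
```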

      [directory]
      https://user:secret@host/test/test2;auth=aws4-hmac-sha256

         [dir options]
         store retrieve list
         do not remove
         time * * * * *

         [files]
         *

             [destination]

                  [recipient]
                  file://local//tmp

This will just get you the file afd.users.sample. It makes the DIR_CONFIG larger, but it is clearer and less error prone. You never know what other subdirectories might pop up when you're not the only one using the bucket.


Copyright © 2021 by H.Kiehl
Holger.Kiehl@dwd.de
Last updated: 07.09.2021