File downloads caching #367

Open
silvae86 opened this issue Apr 20, 2018 · 0 comments
silvae86 commented Apr 20, 2018

Dendro Version if known (or site URL)

v0.3

Please describe the expected behaviour

Dendro should not produce a new temporary file for download if the resource has not been modified since its temporary file was last produced.

This would also make it possible to validate a download (checking whether an error occurs when trying to download a file) before actually downloading it, with minimal performance loss. Without caching, the temporary files are produced twice: once for the request that checks whether the download "can be performed", and once for the actual streaming of the data.

The call to the download method in window_controller.js would become this:

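    $(function ()
    {
        // First request: only checks that the download can be performed.
        $.ajax({
            url: url,
            // timeout: 5000,
            success: function ()
            {
                // Second request: points the hidden iframe at the same URL to start the actual download.
                $("#" + hiddenIFrameID).attr("src", url);
            },
            error: function (err)
            {
                // stack_topright is the PNotify stack defined elsewhere in window_controller.js
                new PNotify({
                    title: "Failed to download resource",
                    text: "If you are using B2DROP, check your credentials or whether the file exists in storage",
                    type: "error",
                    opacity: 1.0,
                    delay: 5000,
                    addclass: "stack-bar-top",
                    cornerclass: "",
                    stack: stack_topright
                });
            }
        });
    });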

Please describe the actual behaviour

Currently, temporary files are produced from MongoDB EVERY time a user requests a download.

Possible ways to fix the problem (programmers)

Implement proposed workflow (see attached pic).

First, implement a class, tempfiles_cache.js, which would connect to MongoDB and create a collection called "downloads_cache". Every document in this collection would have the following fields:

  • uri: the uri value of the resource (file or folder) that was cached for download
  • timestamp: the date when the temporary file was last produced for this resource
  • file_path: the path in the local filesystem where the temporary file resides
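A minimal sketch of what such a class might look like, using the three fields above. It assumes the constructor receives an object that behaves like a MongoDB Node.js driver collection (`findOne` and `updateOne` with `upsert`); the class name `TempFilesCache`, its method names, and the `InMemoryCollection` stand-in are hypothetical and only there so the sketch can be tried without a database.

```javascript
// Hypothetical sketch of tempfiles_cache.js. `collection` is assumed to behave
// like a MongoDB Node.js driver Collection, e.g. db.collection("downloads_cache").
class TempFilesCache {
    constructor(collection) {
        this.collection = collection;
    }

    // Returns the cache record for a resource URI, or null if there is none.
    async get(uri) {
        return this.collection.findOne({ uri: uri });
    }

    // Creates or refreshes the record after a temporary file has been produced.
    async put(uri, filePath, timestamp = new Date()) {
        const record = { uri: uri, timestamp: timestamp, file_path: filePath };
        await this.collection.updateOne({ uri: uri }, { $set: record }, { upsert: true });
        return record;
    }
}

// In-memory stand-in (an assumption for illustration) that mimics only the
// two collection methods used above, keyed by uri.
class InMemoryCollection {
    constructor() {
        this.docs = new Map();
    }

    async findOne(filter) {
        return this.docs.get(filter.uri) || null;
    }

    async updateOne(filter, update, options = {}) {
        const existing = this.docs.get(filter.uri);
        if (!existing && !options.upsert) {
            return { matchedCount: 0 };
        }
        this.docs.set(filter.uri, Object.assign({}, existing, update.$set));
        return { matchedCount: existing ? 1 : 0 };
    }
}
```

Keying the upsert on `uri` keeps at most one record per resource, so refreshing a temporary file overwrites the previous `timestamp` and `file_path` instead of accumulating stale entries.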

Second, create a new property in element.js, ddr:dateFSModified, which holds the date of the last modification that is relevant to the file system. ddr:lastModified holds the date of the last metadata modification, but it makes no sense to refresh temporary files just because the metadata was updated. This means that backups cannot be cached at this time, because a temporary backup must be refreshed taking into account both the metadata and the file system modification timestamps.

Workflow

The workflow is as follows:

  1. A download of a file or folder is requested.
  2. Check the cache for a temporary file with that URI.
    2.1 If it is a folder, check whether the folder itself or any of its children is "dirty" (ddr:dateFSModified > timestamp in cache). This can be done with a simple query on Virtuoso, using nie:isLogicalPart+ to check recursively for any dirty child, grandchild, etc.
    2.1.1a If the folder or any of its children is dirty, the temporary zip file for that folder must be refreshed. Run the zipping code as usual.
    2.1.1b Update the cache record with the new zip file's location in the local filesystem.
    2.1.1c Serve the file referenced by the updated cache record (file_path).
    2.1.2a If neither the folder nor any of its children is dirty, serve the file (file_path) in the current cache record.
    2.2 If it is a file, check whether it is "dirty" (ddr:dateFSModified > timestamp in cache).
    2.2.1a If the file is dirty, produce a new temporary file.
    2.2.1b Update the cache record with the temporary file's location in the local filesystem.
    2.2.1c Serve the file referenced by the updated cache record (file_path).
    2.2.2a If the file is not dirty, serve the file (file_path) in the current cache record.
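The workflow above can be sketched as a single decision function. This is only an illustration under stated assumptions: resources are modelled as plain in-memory objects with `uri`, `dateFSModified` and an optional `children` array (standing in for the recursive Virtuoso query), the cache is a `Map` instead of the MongoDB collection, and `produceTempFile` is a hypothetical callback that zips a folder or copies a file and returns its path. A missing cache record is treated as dirty, which the workflow does not state explicitly.

```javascript
// True if the resource, or any descendant, was modified on the file system
// after the cached temporary file was produced.
function isDirty(resource, cachedTimestamp) {
    if (resource.dateFSModified > cachedTimestamp) {
        return true;
    }
    return (resource.children || []).some(function (child) {
        return isDirty(child, cachedTimestamp);
    });
}

// cache: Map from uri to { uri, timestamp, file_path } (stand-in for downloads_cache)
// produceTempFile: hypothetical callback, returns the path of the new temporary file
function resolveDownload(resource, cache, produceTempFile, now = new Date()) {
    const record = cache.get(resource.uri);

    // Clean cache hit: serve the existing temporary file (steps 2.1.2a / 2.2.2a).
    if (record && !isDirty(resource, record.timestamp)) {
        return { file_path: record.file_path, regenerated: false };
    }

    // Dirty, or never cached (assumption): regenerate, update the record, serve it.
    const filePath = produceTempFile(resource);
    cache.set(resource.uri, { uri: resource.uri, timestamp: now, file_path: filePath });
    return { file_path: filePath, regenerated: true };
}
```

Because files and folders differ only in whether `children` is present, both branches of step 2 collapse into the same code path in this sketch; the real implementation would still need separate zipping and single-file production code behind `produceTempFile`.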

Keeping track of changes

The last file system modification of a file or folder must be recorded at several points:

  1. When creating a folder, set its ddr:dateFSModified to the current date
  2. When uploading a new file, set its ddr:dateFSModified to the current date
  3. When cutting a resource, update its parent folder's ddr:dateFSModified to the current date
  4. When renaming a resource, update its ddr:dateFSModified to the current date
  5. When deleting a resource, update its ddr:dateFSModified to the current date
  6. When uploading a file, update its parent folder's ddr:dateFSModified to the current date
  7. When restoring a folder from backup, update its ddr:dateFSModified to the current date
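One way to keep these rules in a single place is a small lookup table that says which resource gets a new ddr:dateFSModified for each operation. The sketch below is hypothetical: the operation names, the `touchFSModified` helper, and the in-memory `parent` / `dateFSModified` fields are assumptions for illustration, not existing Dendro code, and the real update would be written to the graph instead of to a plain object.

```javascript
// Which resource is touched for each operation, following the list above.
// Uploading touches both the file (item 2) and its parent folder (item 6).
const TOUCH_TARGET = {
    create_folder: "self",
    upload_file: "self_and_parent",
    cut: "parent",
    rename: "self",
    delete: "self",
    restore_folder: "self"
};

// Sets dateFSModified on the affected resources and returns their URIs.
function touchFSModified(operation, resource, now = new Date()) {
    if (!Object.prototype.hasOwnProperty.call(TOUCH_TARGET, operation)) {
        throw new Error("Unknown operation: " + operation);
    }
    const target = TOUCH_TARGET[operation];
    const touched = [];

    if (target === "self" || target === "self_and_parent") {
        touched.push(resource);
    }
    if ((target === "parent" || target === "self_and_parent") && resource.parent) {
        touched.push(resource.parent);
    }

    touched.forEach(function (r) {
        r.dateFSModified = now;
    });
    return touched.map(function (r) {
        return r.uri;
    });
}
```

For example, `touchFSModified("cut", file)` would update only the parent folder of `file`, so the next download of that folder is seen as dirty while the cached temporary file of the moved resource itself stays valid.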

Workflow diagram (drawn quickly while over-caffeinated)

img_20180420_1537335

@silvae86 silvae86 assigned silvae86 and ghost Apr 20, 2018
@silvae86 silvae86 changed the title File downloads cache File downloads caching Apr 20, 2018
silvae86 added a commit that referenced this issue Apr 20, 2018