MapGuide RFC 36 - Improve EnumerateResources API's performance
This page contains a change request (RFC) for the MapGuide Open Source project. More MapGuide RFCs can be found on the RFCs page.
Status
RFC Template Version | (1.0) |
Submission Date | September 16, 2007 |
Last Modified | Steve Dang Timestamp |
Author | Steve Dang |
RFC Status | implemented |
Implementation Status | completed |
Proposed Milestone | 2.0 |
Assigned PSC guide(s) | (when determined) |
Voting History | September 16, 2007 |
+1 | Tom, Jason, Bruce, Andy, Paul |
+0 | |
-0 | |
-1 | |
Abstained | Bob, Haris |
Overview
This RFC describes a new EnumerateResources API that can be made much faster than the existing EnumerateResources API when working with repositories that have a large number of resources.
Motivation
We are finding that when the existing EnumerateResources operation is executed on a repository that has a large number of resources, it is too slow, and may result in an Out Of Memory exception.
For example, if the root folder has 1,000 folders, each containing 1,000 documents, then a request where the resource is “Library://” and the depth is 1 may take about 20 minutes to complete. This is unacceptable and makes client programs that need to list the contents of the repository unusable.
Proposed Solution
The problem with the EnumerateResources API is that it needs to enumerate all of the child resources in a folder just to count them, even though they are not being returned. So for the above example, EnumerateResources will need to compute the number of children for 1,001 folders (the root folder plus 1,000 sub-folders), and to evaluate the user’s permissions on 1,001,001 resources (the root folder plus 1,000 sub-folders plus 1,000 X 1,000 documents), even though the user only gets information on the 1,000 top level folders in the result.
We propose a new EnumerateResources API that can be set to not return the number of resources in leaf-level folders. For our example above, this means that the operation will only need to evaluate the user’s permissions on 1,001 resources (the root folder plus 1,000 sub-folders), about 1,000 times fewer than before, so it should be much faster.
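The arithmetic behind the two figures above can be sketched as a small cost model. This is only an illustration of the counting argument, not MapGuide code; the `RepositoryShape` type and both function names are hypothetical, and the model assumes a root folder whose sub-folders all hold the same number of documents.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cost model (not part of MapGuide): describes a root folder
// whose sub-folders each contain the same number of documents.
struct RepositoryShape {
    std::uint64_t subFolders;          // folders directly under the root
    std::uint64_t documentsPerFolder;  // documents inside each sub-folder
};

// Permission checks for a depth-1 request when the child count of every
// returned folder is computed: root + sub-folders + all their documents.
constexpr std::uint64_t checksWithChildCounts(RepositoryShape r) {
    return 1 + r.subFolders + r.subFolders * r.documentsPerFolder;
}

// Permission checks for the same request when leaf-level child counts are
// skipped: root + sub-folders only.
constexpr std::uint64_t checksWithoutChildCounts(RepositoryShape r) {
    return 1 + r.subFolders;
}

// The scenario from this RFC: 1,000 folders with 1,000 documents each.
static_assert(checksWithChildCounts({1000, 1000}) == 1001001,
              "root + 1,000 folders + 1,000,000 documents");
static_assert(checksWithoutChildCounts({1000, 1000}) == 1001,
              "root + 1,000 folders");
```

The model only counts permission evaluations; actual run time also depends on the repository back end and indexing.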
The new EnumerateResources API is described/documented as follows:
//////////////////////////////////////////////////////////////////////////////////////////////////////
/// \brief
/// Enumerates the resources in the specified repository.
///
/// \remarks
/// You can enumerate all types or just a selected type. You can
/// also choose what depth in the repository to examine.
/// This method only works on "Library" repository.
/// If you specify a repository that is not supported, this method
/// will throw an InvalidRepositoryType exception.
///
/// <!-- Syntax in .Net, Java, and PHP -->
/// \htmlinclude DotNetSyntaxTop.html
/// MgByteReader EnumerateResources(MgResourceIdentifier resource, int depth, string type, bool computeChildren);
/// \htmlinclude SyntaxBottom.html
/// \htmlinclude JavaSyntaxTop.html
/// MgByteReader EnumerateResources(MgResourceIdentifier resource, int depth, String type, boolean computeChildren);
/// \htmlinclude SyntaxBottom.html
/// \htmlinclude PHPSyntaxTop.html
/// MgByteReader EnumerateResources(MgResourceIdentifier resource, int depth, string type, bool computeChildren);
/// \htmlinclude SyntaxBottom.html
///
/// \param resource (MgResourceIdentifier)
/// Resource identifier specifying the resource to
/// enumerate. This can be a document or a folder.
/// \param depth (int)
/// Recursion depth, relative to the specified resource.
/// <ul>
/// <li>If the resource is a document, depth must be set to
/// 0.
/// </li>
/// <li>If the resource is a folder:
/// <ul>
/// <li>If the depth is equal to 0, only information about
/// the specified folder is returned.
/// </li>
/// <li>If the depth is greater than 0, information about
/// the folder and its descendants up to the specified
/// depth are returned.
/// </li>
/// </ul>
/// <li>If the depth is -1, information about the folder
/// and all its descendants is returned.
/// </li>
/// </ul>
/// \param type (String/string)
/// Type of the resource to be enumerated. (Case
/// sensitive.) See \link MgResourceType MgResourceType \endlink for valid
/// types. If the type is a folder, you must include the
/// trailing slash.
/// \n
/// Or, this can
/// be set to null, in which case information about all
/// resource types is returned.
/// \param computeChildren (boolean/bool)
/// Flag to determine whether or not the number of children of the folder
/// resource at the specified depth should be computed.
/// <ul>
/// <li>If it is true, then the number of children of the folder
/// resource at the specified depth will be set to a computed value (>= 0).
/// </li>
/// <li>If it is false, then the number of children of the folder
/// resource at the specified depth will be set to -1.
/// </li>
/// </ul>
///
/// \return
/// Returns an MgByteReader object containing a description of
/// the resources in XML format using the \link ResourceList_schema ResourceList \endlink
/// schema.
///
/// <!-- Example (PHP) -->
/// \htmlinclude PHPExampleTop.html
/// These examples assume that <c>$resourceService</c> has
/// already been initialized.
/// \code
/// // Enumerates everything in the library
/// $resourceID = new MgResourceIdentifier("Library://");
/// $byteReader = $resourceService->EnumerateResources($resourceID, -1, "", true);
///
/// // Enumerates everything in Geography
/// $resourceID = new MgResourceIdentifier("Library://Geography/");
/// $byteReader = $resourceService->EnumerateResources($resourceID, -1, "", true);
///
/// // Enumerates all maps in the library
/// $resourceID = new MgResourceIdentifier("Library://");
/// $byteReader = $resourceService->EnumerateResources($resourceID, -1, "MapDefinition", true);
///
/// // Enumerates all folders in the library
/// $resourceID = new MgResourceIdentifier("Library://");
/// $byteReader = $resourceService->EnumerateResources($resourceID, -1, "Folder", true);
///
/// // Enumerates the folder Geography
/// $resourceID = new MgResourceIdentifier("Library://Geography/");
/// $byteReader = $resourceService->EnumerateResources($resourceID, 0, "Folder", false);
///
/// // Enumerates maps one level below Geography
/// $resourceID = new MgResourceIdentifier("Library://Geography/");
/// $byteReader = $resourceService->EnumerateResources($resourceID, 1, "MapDefinition", false);
///
/// // Enumerates a specific map
/// // NOTE: In this case, depth must be set to 0
/// $resourceID = new MgResourceIdentifier("Library://Geography/World.MapDefinition");
/// $byteReader = $resourceService->EnumerateResources($resourceID, 0, "MapDefinition", false);
/// \endcode
/// \htmlinclude ExampleBottom.html
///
/// \exception MgInvalidRepositoryTypeException
/// \exception MgInvalidRepositoryNameException
/// \exception MgInvalidResourcePathException
/// \exception MgInvalidResourceNameException
/// \exception MgInvalidResourceTypeException
///
MgByteReader* EnumerateResources(MgResourceIdentifier* resource, INT32 depth, CREFSTRING type, bool computeChildren);
Note that the performance of the EnumerateResources API depends on the recursion depth and on how effectively the tree hierarchy of resources is laid out. Client applications are encouraged to use the new API to take advantage of the improvement.
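A client that switches to the new API with computeChildren set to false has to treat a child count of -1 as "not computed" rather than as a real value. The following is a minimal sketch of that handling, assuming the client has already parsed the returned XML into its own structure; `FolderEntry`, `childCountUnknown`, and `folderLabel` are hypothetical names, not part of the MapGuide API or the ResourceList schema.

```cpp
#include <string>

// Hypothetical client-side record for one folder in the result.
// The field names are illustrative and not taken from the ResourceList schema.
struct FolderEntry {
    std::string resourceId;  // e.g. "Library://Geography/"
    int childCount;          // >= 0 when computed, -1 when not computed
};

// True when the server skipped the child count (computeChildren was false).
inline bool childCountUnknown(const FolderEntry& entry) {
    return entry.childCount < 0;
}

// Builds a display label that never presents -1 as a real count.
inline std::string folderLabel(const FolderEntry& entry) {
    if (childCountUnknown(entry)) {
        return entry.resourceId + " (children not counted)";
    }
    return entry.resourceId + " (" + std::to_string(entry.childCount) + " children)";
}
```

A tree view built this way could request the real count for a single folder only when the user expands it, instead of paying for every count up front.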
Implications
- There is no compatibility issue associated with the new EnumerateResources API (i.e. no change in the version of the operation).
- Some indexing will be added to improve the performance of both old and new EnumerateResources APIs.
- The old EnumerateResources API will be deprecated in the future.
Test Plan
Tests need to be done to verify:
- The old API works as before (and faster, due to indexing).
- The new API works the same as the old API when the “computeChildren” flag is set to true, and works faster when the flag is set to false.
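One way to phrase the second check is as a comparison between two result lists for the same request, one obtained with computeChildren set to true and one with it set to false. The sketch below is only an illustration under stated assumptions: `EnumeratedFolder` and `matchesExceptChildCounts` are hypothetical names, the results are assumed to be already parsed into simple records, and both calls are assumed to return folders in the same order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified record for one folder in an enumeration result.
struct EnumeratedFolder {
    std::string resourceId;
    int childCount;  // -1 means "not computed"
};

// Returns true when `fast` (computeChildren == false) lists the same folders,
// in the same order, as `full` (computeChildren == true), and every child
// count in `fast` either matches `full` or is the -1 placeholder.
inline bool matchesExceptChildCounts(const std::vector<EnumeratedFolder>& full,
                                     const std::vector<EnumeratedFolder>& fast) {
    if (full.size() != fast.size()) return false;
    for (std::size_t i = 0; i < full.size(); ++i) {
        if (full[i].resourceId != fast[i].resourceId) return false;
        const bool sameCount = full[i].childCount == fast[i].childCount;
        const bool skipped = fast[i].childCount == -1;
        if (!sameCount && !skipped) return false;
    }
    return true;
}
```

Timing the two calls against a large test repository would cover the "faster" half of the check; this helper only covers the "same content" half.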
Funding/Resources
Autodesk will provide funding and resources.