We upgraded directly from 5.4 to 5.6 with an in-place upgrade using the "-unpack" method. We did the Author on a Saturday and the Publishers on the Sunday, mostly because we ran into the Publisher permissions problem on Saturday night and stayed on the 5.4 Publishers while trying to resolve it.
Anonymous user permissions on Publishers are more restrictive in 5.6: the anonymous user pretty much only has access to /content and /etc. So any assets that may have been used from /apps, even if not through the public interface, should probably move under /etc. We had XSL transformations where the component couldn't read the .xsl file under /apps. I think this is also the root cause of the ResourceResolver issues.
In 5.5 it looks like they switched from a HomeACLSetupService to what appears to be Jackrabbit's own default permissions for user home directories. Without this fix, newly created users won't have the default permissions on their home directory, which yields errors. There is a KB article to help resolve this. We switched the whole block; note that the UserManager class has also changed: http://helpx.adobe.com/cq/kb/avoid-new-user-permissions-issue.html.
There is a new Apache Sling Referrer Filter OSGi configuration that will block any curl commands. We chose Allow Empty to get around that, but I assume the filter is there as a quasi XSS filter.
The backups are different and the docs aren't crystal clear on how to automate this. After inspecting the network traffic from the GUI, we landed on:
curl -u admin:password -X POST --data "delay=0&force=true&installDir=/usr/wcm/author&target=/usr/wcm/backup/daily" http://localhost:4502/libs/granite/backup/content/admin/backups/
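For anyone wanting to run this from cron, here is a rough sketch of a wrapper around the same request. The URL, form fields and paths are the ones from the command above; the variable and function names (CQ_URL, CQ_CREDENTIALS, backup_form_data, run_backup) are made up for illustration, and we have not verified what the endpoint returns on failure.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper around the backup request above.
# Defaults are the values from our environment; all names here are illustrative.
CQ_URL="${CQ_URL:-http://localhost:4502}"
INSTALL_DIR="${INSTALL_DIR:-/usr/wcm/author}"
BACKUP_ROOT="${BACKUP_ROOT:-/usr/wcm/backup}"

# backup_form_data TARGET_DIR: prints the POST body used in the request above
backup_form_data() {
    printf 'delay=0&force=true&installDir=%s&target=%s' "$INSTALL_DIR" "$1"
}

# run_backup TARGET_DIR: posts the request; --fail makes curl exit non-zero on HTTP errors
run_backup() {
    curl --fail --silent --show-error -u "$CQ_CREDENTIALS" -X POST \
        --data "$(backup_form_data "$1")" \
        "$CQ_URL/libs/granite/backup/content/admin/backups/"
}

# Only send the request when credentials are supplied in the environment
if [ -n "${CQ_CREDENTIALS:-}" ]; then
    run_backup "$BACKUP_ROOT/daily"
fi
```

Run it as something like CQ_CREDENTIALS=admin:password ./backup.sh so the password stays out of the script itself.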
The Replication permissions have changed from a principal-based method (stored under /home/groups) to a resource-based method (stored alongside the content under /content), which is more in line with how the other permissions are set in CQ. We didn't try the automated converter, but it exists here: http://helpx.adobe.com/cq/kb/replication-privileges-missing-after-upgrade-to-cq-5-5.html
We have an open ticket with Adobe regarding the ulimit in the new bin/start script. I can't tell where it is actually setting ulimit -n, which defaults to 1024 on RHEL. After about one week on 5.6 our Author died and spat out 22 gigs of error logs in a very short time period, about 160,000 lines per second. Tailing the log led us to a max open files error, but nothing Adobe has told us has resolved this issue. We manually added ulimit -n 8192 to our bin/start script because their check only looks at ulimit, which is unlimited, not ulimit -n, the max open file descriptors, which is what the previous serverctl actually set.
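As a rough sketch of the kind of guard we mean (this is not Adobe's check, and needs_raise is a made-up helper name), something like the following near the top of bin/start raises the open file limit only when it is below the 8192 we settled on:

```shell
#!/bin/sh
# Hypothetical guard for bin/start. 8192 is the value we chose, not an Adobe default.
MIN_OPEN_FILES=8192

# needs_raise CURRENT MIN: succeeds when CURRENT is a number below MIN
needs_raise() {
    case "$1" in
        unlimited) return 1 ;;   # nothing to raise
    esac
    [ "$1" -lt "$2" ]
}

current=$(ulimit -n)             # soft limit on open file descriptors
if needs_raise "$current" "$MIN_OPEN_FILES"; then
    # This fails if the hard limit is lower than MIN_OPEN_FILES; warn instead of aborting
    ulimit -n "$MIN_OPEN_FILES" || echo "could not raise open file limit from $current" >&2
fi
echo "open file limit: $(ulimit -n)"
```

If the hard limit is lower than 8192, it has to be raised at the OS level first (on RHEL, typically in /etc/security/limits.conf) before the script can raise the soft limit.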
We ended up making changes to how the ResourceResolver was acquired. Many of our components, mostly those running queries, failed on the Publishers after the 5.6 upgrade. You may not end up having any instances of this, but for whatever reason we had dozens of places where we had gotten a ResourceResolver from a resource with resource.getResourceResolver(). This looked something like:
ResourceResolver resolver = resource.getResourceResolver();
Iterator<Resource> resources = resolver.findResources("//query", Query.XPATH);
In JSPs, this was simple to resolve because resourceResolver is already defined and available through global.jsp. The above became just:
Iterator<Resource> resources = resourceResolver.findResources("//query", Query.XPATH);
In a Java class, this is more difficult, and looked very similar:
ResourceResolver resolver = resource.getResourceResolver();
Iterator<Resource> resources = resolver.findResources("//query", Query.XPATH);
This required passing the SlingHttpServletRequest object (slingRequest from the JSP) into the Java class, and became:
ResourceResolver resolver = request.getResourceResolver();
Iterator<Resource> resources = resolver.findResources("//query", Query.XPATH);
Our Checklist, derived from the following links and trial and error:
http://dev.day.com/docs/en/cq/current/deploying/upgrading.html
http://dev.day.com/docs/en/cq/current/deploying/upgrading/tips-and-troubleshooting.html
http://dev.day.com/docs/en/cq/current/deploying/upgrading/planning.html
Pre-reqs
- Terminate all the instances of running Workflows
- Clean up Workflow Archives if possible, this will be the longest part of the upgrade
- Run the consistency check by enabling it in workspace.xml and repository.xml and starting the server: http://dev.day.com/docs/en/cq/current/deploying/upgrading/consistency-check.html
- Run garbage collection
- Run tar optimization
- TarPM index merge
- Make backups
- Disable backups and any other scheduled jobs that can be disabled
- 404 handler: change binding to sbindings (it's done in code)
- Stop Replication Agents
- Stop CQ
- Delete Logs
Upgrade to AEM 5.6
- Copy New AEM 5.6 Jar and rename it according to the NEW specs
- Unpack AEM (We used the unpack directions, not running the jar directly)
- Change Memory Allocation in bin/start (start script moved from 5.4 to 5.6 from crx-quickstart/server/serverctl to crx-quickstart/bin/start)
- Start the server
- Monitor upgrade.log (Author takes much longer than Publish, mostly because of workflow conversion)
- Restart the server after upgrade.log says completed
- Change the update_asset ffmpeg_renditions (any workflows may be overridden under /etc)
- Install rebuilt bundles for AEM 5.6
- Test component difference
- Compare the Bundles Json using the OSGi Diff Tool (http://experiencedelivers.adobe.com/cemblog/en/experiencedelivers/2012/11/how_to_sanity_checkosgibundlesacrossenvirnoments.html)
- Disable monitoring scripts (https://helpx.adobe.com/cq/kb/CQ55MonitoringTooManyOpenFiles.html)
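For the "Monitor upgrade.log" step, a small polling helper can save some staring at tail output. This is only a sketch: wait_for_log_line and POLL_SECONDS are made-up names, and both the log path and the text that marks completion are placeholders to check against your own upgrade.log.

```shell
#!/bin/sh
# wait_for_log_line FILE PATTERN [TIMEOUT_SECONDS]
# Polls FILE until a line matching PATTERN appears. Returns 0 when found, 1 on timeout.
POLL_SECONDS="${POLL_SECONDS:-5}"

wait_for_log_line() {
    file=$1
    pattern=$2
    timeout=${3:-3600}
    waited=0
    while :; do
        # The file may not exist yet right after the server starts
        if [ -f "$file" ] && grep -q -- "$pattern" "$file"; then
            return 0
        fi
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$POLL_SECONDS"
        waited=$((waited + POLL_SECONDS))
    done
}
```

Call it with the path to your upgrade.log and whatever completion text you see there, then restart the server once it returns.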
Thanks for the info Matthew, can you clarify the "Change the update_asset ffmpeg_renditions (any workflows may be overridden under /etc)" step?
I am also wondering at which point you switched from Java 1.6 to 1.7 in the upgrade?