Concept - URL Rewriting (from 7.10.5)


Product Version

7.10


Introduction

This concept describes the URLRewriteHandler, which is configured via XML and can be extended using the component framework. Further features include improved testability, live reloading, and a set of configurable, reusable rules.

Historically, URLs in Intershop have been fairly long. They contain parameters that specify a locale, a currency, an application, a site or a service type, for example:

http://www.example.com/INTERSHOP/web/WFS/PrimeTech-PrimeTechSpecials-Site/en_US/-/USD/ViewStandardCatalog-Browse?CategoryDomainName=PrimeTech-Computers&CategoryName=Notebooks (Standard Intershop URL)

Modern SEO practice advises against URLs like this and recommends short, structured and readable URLs instead. To fulfill these requirements, Intershop provides four concepts to handle URLs.

Glossary

Name | Description
URLRewriteHandler | Allows creating short, readable URLs in an Intershop system.
Domain-Splitting | In addition to URL rewriting, you can pre-define default parameters for given host names. For example, the store at www.example.com can be told to use the currency USD and the localization en_US automatically. This way, URL rewriting rules can be much shorter as they contain fewer parameters. The definitions can also be set in the global domainsplittings.xml file.
Short Links | Landing page URLs like http://www.example.com/summer-sale-2012/ can be defined via the Commerce Management application. This feature allows one-way links directing to the page of interest.

URL Anatomy

URLs in Intershop start with the protocol, the server name and, optionally, a port. The two segments that follow, "prefix" and "service type", are set using URL configurations.

http[s]://[server name]:[port]/[prefix]/[service type]/[server group]/[site id]/[locale]/[application id]/[currency]/[action]?[parameters]

Example:

https://www.example.com:443/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/de_DE/-/USD/ViewUserAccount-Start?targetPage=Homepage

History

A short overview of the URL Rewrite Handler evolution.

Version | Name | Description | Documentation
< 7.4 | RuleRewriteHandler | Regular expression rules and a host specific domainsplitting configured in properties files | see "URL Handling" documents
< 7.8 | RuleRewriteHandler + advanced Domainsplitting | Regular expression rules and a host+shortpath specific domainsplitting configured in properties files | see "URL Handling" documents
7.8 | New RewriteRuleHandlerImpl | Complete rework supports custom rewrite-rule types and configuration in XML | see "URL Rewriting" documents

Note

Beginning with ICM 7.8, the old regular-expression-based URLRewriteHandler is deprecated. For migration hints, see Cookbook - URL Rewriting.

Algorithm

The following diagram illustrates the compacting algorithm performed by the new URLRewriteHandler.

[Diagram: compacting algorithm]

The expand process is shown in the following diagram:

[Diagram: expand process]

Artifacts

Domain Splitting

Domain splittings allow host-specific settings for server group, locale, currency, application and site. Via shortpathpattern, this information can also be extracted from the path, for example by placing the locale inside each URL using shortpathpattern=/${locale}${path}.

  • Compacting: The domain splittings are matched from top to bottom (see the XML file in the domainsplittings.xml section).
  • Expanding: The domain splittings are matched from bottom to top (see the XML file in the domainsplittings.xml section).

If you have only a single host and therefore want the host name to act as a wildcard, specify no host for the relevant domain splitting.
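As an illustration of the wildcard case, the following sketch shows a single domain splitting without any host restriction that takes the locale from the first path segment. The name example-any-host is hypothetical, the parameter values are copied from the sample file in the domainsplittings.xml section, and it is an assumption that "no host" means leaving out the hosts element entirely; check this against domainsplittings.xsd before using it.

```xml
<?xml version="1.0" encoding="utf-8"?>
<domainsplittings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- hypothetical single-host setup: no <hosts> element (assumption), so any host name matches -->
    <domainsplitting name="example-any-host">
        <!-- the first path segment is read as the locale, the remainder as the path -->
        <shortpathpattern>/${locale}${path}</shortpathpattern>

        <!-- parameters (values taken from the sample domainsplittings.xml) -->
        <site>inSPIRED-inSPIRED-Site</site>
        <server-group>WFS</server-group>
        <currency>EUR</currency>
        <appurlid>web</appurlid>

        <!-- no replacements: the locale appears in the URL in its full form, e.g. en_US -->
        <replacements/>
    </domainsplitting>
</domainsplittings>
```

Since no locale element is set here, the locale is expected to come from the path only.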

XML Definition

The XML configuration files of the domain splittings (named domainsplittings.xml) and the rewrite rules (named urlrewriterules.xml) must be placed in <IS_SHARE>system/config/cluster.

domainsplittings.xml

The schema of the domain splitting XML can be found in /staticfiles/definition/domainsplittings.xsd of cartridge bc_urlrewrite. The actual domain-splittings are defined in the file domainsplittings.xml, which is deployed to your application server under /share/system/config/cluster/domainsplittings.xml.

Example of domainsplittings.xml
<?xml version="1.0" encoding="utf-8"?>
<domainsplittings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <domainsplitting name="example-fallback">
        <!-- matching -->
        <hosts>
            <host>localhost</host>
            <host>127.0.0.1</host>
            <host>www.example.com</host>
        </hosts>
        <shortpathpattern>${path}</shortpathpattern>

        <!-- parameters -->
        <site>inSPIRED-inSPIRED-Site</site>
        <server-group>WFS</server-group>
        <currency>EUR</currency>
        <appurlid>web</appurlid>
        <locale>en_US</locale>

        <!-- additional replacements for shortpathpattern parameters -->
        <replacements/>
    </domainsplitting>
    <domainsplitting name="example-DE">
        <!-- matching -->
        <hosts>
            <host>localhost</host>
            <host>127.0.0.1</host>
            <host>www.example.com</host>
        </hosts>
        <!-- Use the same replacement string (de and fr) in the compact element of the replacements -->
        <shortpathpattern>/${locale:(de|fr)}${path}</shortpathpattern>

        <!-- parameters -->
        <site>inSPIRED-inSPIRED-Site</site>
        <server-group>WFS</server-group>
        <currency>EUR</currency>
        <appurlid>web</appurlid>

        <!-- additional replacements for shortpathpattern parameters -->
        <replacements>
            <replacement type="locale">
                <compact>de</compact>
                <expand>de_DE</expand>
            </replacement>
            <replacement type="locale">
                <compact>fr</compact>
                <expand>fr_FR</expand>
            </replacement>
        </replacements>
    </domainsplitting>
</domainsplittings>

Note

If using replacement parameters (like (de|fr) for the locale in the example above), be sure to use the same string in the <compact> element of the replacements.

urlrewriterules.xml

The schema of the rule configuration XML can be found in /staticfiles/definition/urlrewriterules.xsd of cartridge bc_urlrewrite. The actual rewrite rules are defined in the urlrewriterules.xml. This file is deployed to your application server under /share/system/config/cluster/urlrewriterules.xml.

A rewrite rule is configured within urlrewriterules.xml as a rule element. The attributes type and priority of the rule element are mandatory, whereas the attribute name is optional. A rule element can contain the optional child elements sites, appurlids, locales, currencies, server-groups and configurations. When a rewrite rule is checked for applicability, the elements sites, appurlids, locales, currencies and server-groups are linked via Boolean AND, whereas the child elements of each of these elements are linked via Boolean OR.

Example
<rule type="Homepage" priority="100">
    <sites>
        <site>inSPIRED-inTRONICS-Site</site>
        <site>inSPIRED-inTRONICS_Business-Site</site>
    </sites>
    <currencies>
        <currency>USD</currency>
        <currency>EUR</currency>
    </currencies>
    <configurations>
        <configuration id="shortPath">/startpage</configuration>
    </configurations>
</rule>

The rewrite rule above is applied if the current site is inSPIRED-inTRONICS-Site or inSPIRED-inTRONICS_Business-Site and the current currency is USD or EUR.

Example of urlrewriterules.xml
<?xml version="1.0" encoding="utf-8"?> 
<rules xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

	<!-- Homepage rule 
		accepts "", "/" or "<shortPath>"
		performs a redirect to "<shortPath>" 
		-->
	<rule name="Homepage rule" type="Homepage" priority="50">
		<!-- make rules "specific" -->
		<sites>
			<site>inSPIRED-inTRONICS-Site</site>
		</sites>
		<appurlids>
			<appurlid>-</appurlid>
		</appurlids>
		<locales>
			<locale>en_US</locale>
		</locales>
		<currencies>
			<currency>USD</currency>
		</currencies>
		<server-groups>
			<server-group>WFS</server-group>
			<server-group>BOS</server-group>
		</server-groups>
		<!-- configuration section -->
		<configurations>
			<configuration id="shortPath">/startpage_en</configuration>
		</configurations>
	</rule>

	<!-- Category Rule 
		/<category-path>/   (contains trailing slash!)
		-->  
	<rule name="Category Rule" type="Category" priority="40">
		<configurations>
			<configuration id="fullCategoryPath">true</configuration>
			<configuration id="localizedCategoryPath">false</configuration>
			<configuration id="excludedCharactersRule">[^a-zA-Z0-9]</configuration>
			<configuration id="replacementForExcludedCharacters">-</configuration>
		</configurations>
	</rule>

	<!-- Product Rule 
		/<category-path>/<product-name>-zid<sku>
		does redirect if product-name doesn't match
		-->  
	<rule name="Product Rule" type="Product" priority="600">
		<configurations>
			<configuration id="fullCategoryPath">true</configuration>
			<configuration id="localizedCategoryPath">false</configuration>
			<configuration id="excludedCharactersRule">[^a-zA-Z0-9]</configuration>
			<configuration id="replacementForExcludedCharacters">-</configuration>
		</configurations>
	</rule>

	<!-- pipeline rule for cart -->
	<rule name="ViewCart /cart" type="Pipeline" priority="800">
		<configurations>
			<configuration id="startNode">ViewCart-View</configuration>
			<configuration id="shortPath">/cart</configuration>
		</configurations>
	</rule>

	<!-- Sitemap products
		/sitemap-products/<category-path>
		-->
	<rule type="SitemapProducts" priority="100" name="sitemap products">
		<configurations>
			<configuration id="fullCategoryPath">true</configuration>
			<configuration id="localizedCategoryPath">true</configuration>
			<configuration id="shortPath">/sitemap-products</configuration>
		</configurations>
	</rule>

	<!-- page rule for terms-and-conditions -->
	<rule name="Page /terms-and-conditions" type="Page" priority="780">
		<configurations>
			<configuration id="pageletId">systempage.termsAndConditions.pagelet2-Page</configuration>
			<configuration id="shortPath">/terms-and-conditions</configuration>
		</configurations>
	</rule>
 
	<!-- type "PageGeneric" creates /{PageletEntryPoint DisplayName}-cms-${p.PageletEntryPointID} -->
	<rule type="PageGeneric" priority="10" name="page fall-back">
		<configurations>
			<configuration id="uniquePrefix">-cms-</configuration>
			<configuration id="excludedCharactersRule">[^a-zA-Z0-9]</configuration>
			<configuration id="replacementForExcludedCharacters">-</configuration>
		</configurations>
	</rule>

	<!-- type "Search" creates /search/<search term value>/ (default) or /search?SearchTerm=<search term value> -->
	<rule type="Search" priority="70" name="SimpleSearch"></rule>
</rules>

Structure

Rewrite Rules

Overview

A set of reusable, configurable rewrite rules is available via the configuration framework.

Be aware of the newly introduced configurations

  • excludedCharactersRule,
  • excludedCharacters and
  • replacementForExcludedCharacters,

which might influence the appearance of the URL.

The following rules are affected:

  • CategoryRewriteRule,
  • ProductRewriteRule and
  • PageGenericRewriteRule.


Each rule is listed with its name, pipeline, parameters, configuration parameters and description.

Name: Homepage
Pipeline: ViewHomepage-Start
Parameters: -
Configuration parameters: shortPath
Description: Defines a starting page for a shop by specifying a shortPath (e.g., /start, / or "").

Name: Pipeline
Pipeline: Any pipeline identified by the configuration parameter "startNode"
Parameters: pipeline
Configuration parameters: startNode, shortPath
Description: The pipeline node (e.g., startNode) is replaced with a given shortPath.

Name: Page
Pipeline: ViewContent-Start
Parameters: PageletEntryPointID
Configuration parameters: pageletId, shortPath
Description: The "shortPath" is set when the pageletId matches.

Name: PageGeneric
Pipeline: ViewContent-Start
Parameters: PageletEntryPointID
Configuration parameters:
  • uniquePrefix
  • excludedCharactersRule: Regular expression rule that defines characters to be replaced in the URL.
  • excludedCharacters: List of white space separated characters which are (additionally) replaced in the URL.
  • replacementForExcludedCharacters: Replacement of excluded characters. If not defined, a dash ("-") is used by default.
Description: Creates CMS links automatically following the pattern /<PageletEntryPoint:DisplayName><uniquePrefix><PageletEntryPointID>

Name: Product
Pipeline: ViewProduct-Start
Parameters: SKU
Configuration parameters:
  • fullCategoryPath
  • localizedCategoryPath
  • excludedCharactersRule: Regular expression rule that defines characters to be replaced in the URL.
  • excludedCharacters: List of white space separated characters which are (additionally) replaced in the URL.
  • replacementForExcludedCharacters: Replacement of excluded characters. If not defined, a dash ("-") is used by default.
  • encodePath
Description: Creates links with the following pattern: /GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362

Name: Category
Pipeline: ViewStandardCatalog-Browse
Parameters: CatalogID, CategoryName
Configuration parameters:
  • fullCategoryPath
  • localizedCategoryPath
  • excludedCharactersRule: Regular expression rule that defines characters to be replaced in the URL.
  • excludedCharacters: List of white space separated characters which are (additionally) replaced in the URL.
  • replacementForExcludedCharacters: Replacement of excluded characters. If not defined, a dash ("-") is used by default.
  • encodePath
Description: All incoming parameters are removed.

Name: RegEx
Pipeline: Depends on the select/selectMatch statement (see ${pipeline})
Parameters: Depend on the select/selectMatch statement. Placeholders have the form ${p.<param-name>}, e.g., ${p.SearchTerm}.
Configuration parameters: select, selectMatch, shortPath, shortPathMatch, longRequest
Description:

Compact Rule Definition
# select may contain any placeholder, e.g., pipeline, site, servergroup, currency, appurlid, locale
# example for compacting
select = ${pipeline}/${p.SearchTerm}/
selectMatch = ^ViewParametricSearch-Browse/.+/$
shortPath = /search/${p.SearchTerm}

Expand Rule Definition
# shortPathMatch should match the shortPath above
shortPathMatch = <regular expression>
# longRequest may use placeholders $1..$99 referencing the regex findings
longRequest = <pipeline-node>?<parameter>

# examples for expanding
shortPathMatch = ^/search/(.+)$
longRequest = ViewSearch-Browse?query=$1&tracking=true
-or-
shortPathMatch = ^/contact$
longRequest = ViewContact-Start

Name: SitemapProducts
Pipeline: ViewSitemap-ProductList
Parameters: CatalogID, CategoryName
Configuration parameters: shortPath (e.g., "/sitemap-products"), fullCategoryPath, localizedCategoryPath
Description: Creates a product sitemap with a prepended short path: /<shortPath>/<categoryPath>$

Name: ViewData
Pipeline: ViewData-Start
Parameters: -
Configuration parameters: shortPath (e.g., "/viewdata", "/cart/de_DE")
Description: A short path, usually starting with a slash. This parameter is optional. If no short path is configured, the rule uses "/viewdata" as short path by default. This parameter must not end with a "/". Pattern: /<shortPath>

Name: Search
Pipeline: ViewParametricSearch-SimpleOfferSearch
Parameters: SearchTerm (= search form parameter)
Configuration parameters (with default values):
  • shortPath = "/search"
  • shortPathMatch = "^/search/(.+)/$"
  • startNode = "ViewParametricSearch-SimpleOfferSearch"
  • searchTerm = "SearchTerm"
  • searchTermLocation = "path"
Description: This rule handles the storefront search. It extracts the form parameter from the original request. The name of the form parameter can be changed by using the 'searchTerm' configuration.
Info: This rule is valid from Intershop 7.10.15.5. See description for details.

Name: SitemapProductPWA, SitemapCategoryPWA (PWA rules)
Pipeline: ViewProductPWA-Start, ViewStandardCatalogPWA-Browse
Parameters: -
Configuration parameters:
  • slugifyPwaDefault = true
  • excludedCharacters = "( )"
  • since 7.10.40: excludedCharacters = "( ) &"
Info: This rule is valid from Intershop 7.10.26. See description for details.

Name: SitemapContentPagePWA, SitemapRangePWA (PWA rules)
Pipeline: ViewContentPWA-Start, ViewSiteMapXMLforPWA-Start
Parameters: -
Configuration parameters: -
Info: This rule is valid from Intershop 7.10.26. See description for details.

Name: SitemapProductPWA3, SitemapCategoryPWA3 (PWA3 rules)
Pipeline: ViewProductPWA3-Start, ViewStandardCatalogPWA3-Browse
Parameters: -
Configuration parameters:
  • slugifyPwa3Default = true
  • excludedCharactersRule = [ &()=]
  • urlCharactersNotToEncode = /,'
  • Example to configure a white list of characters: excludedCharactersRule = [^a-zA-Z0-9äöüÄÖÜé]
Info: This rule is valid from Intershop 7.10.40. See description for details.

Name: SitemapContentPagePWA3 (PWA3 rule)
Pipeline: ViewContentPWA3-Start
Parameters: -
Configuration parameters: -
Info: This rule is valid from Intershop 7.10.40. See description for details.

Homepage

A simple rule that rewrites the homepage pipeline ViewHomepage-Start to a short path, e.g., "/" or "/startpage".

Rule configuration parameter:

Name | Value | Description
shortPath | e.g., "/", "/startpage", .. | A short path, usually starting with a slash.

Legacy Rule
rule.home.select = ${action}
rule.home.selectMatch = ^ViewHomepage-Start$
rule.home.shortPath = /
rule.home.shortPathMatch = ^([\\/]?)$
rule.home.longRequest = /${group}/${domain}/${locale}/${appurlid}/${currency}/ViewHomepage-Start

Pipeline

A pipeline rule maps a given pipeline startNode to a fixed shortPath. For instance, ViewSitemap-Start maps onto /sitemaps and vice-versa.

Rule configuration parameters:

Name | Value | Description
startNode | "ViewSitemap-Start" | The pipeline name and start node name of the related pipeline.
shortPath | "/sitemaps" | A short path, usually starting with a slash.

RegEx

Classic regular-expression-based rules.

Name | Value | Description

Compact
select | e.g., "${action}/${p.parameter}" | Use parameters p.* or the predefined placeholders: ${action}, ${protocol}, ${locale}, ${currency}
selectMatch | e.g., "^ViewSitemap-Start/.+$" | Regular expression to match against the "select"
shortPath | e.g., "/sitemap${d.deletethisparameter}" | The compacted short path

Expand
shortPathMatch | e.g., "^/sitemap(.*)$" | Regular expression that incoming URL paths are matched against.
longRequest | e.g., "ViewSitemap-Start?Appendix=$1" | On a match, defines which pipeline is called. Use $1, $2, .. to reference the groups captured by shortPathMatch.

References to pipeline dictionary entries (e.g., #ProductBO:DisplayName#) are NOT SUPPORTED anymore. These hidden dependencies were hard to test and caused confusion.
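The parameters above could be combined into a rule element in urlrewriterules.xml roughly as follows. This is a sketch under assumptions: the type name RegEx is taken from the rule overview, while the rule name, the priority value and the parameter name Appendix (borrowed from the longRequest example) are chosen for illustration only.

```xml
<!-- hypothetical RegEx rule; name, priority and the parameter "Appendix" are illustrative assumptions -->
<rule name="Sitemap RegEx" type="RegEx" priority="200">
    <configurations>
        <!-- compact: the string built by "select" is tested against "selectMatch" -->
        <configuration id="select">${action}/${p.Appendix}</configuration>
        <configuration id="selectMatch">^ViewSitemap-Start/.+$</configuration>
        <configuration id="shortPath">/sitemap${p.Appendix}</configuration>

        <!-- expand: $1 refers to the first group captured by "shortPathMatch" -->
        <configuration id="shortPathMatch">^/sitemap(.*)$</configuration>
        <configuration id="longRequest">ViewSitemap-Start?Appendix=$1</configuration>
    </configurations>
</rule>
```

Note that a pattern as broad as ^/sitemap(.*)$ also matches paths such as /sitemap-products, so the priority relative to other rules matters when such a rule is used.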

Page

Rewrites URLs that link to a CMS page via the pipeline start node ViewContent-Start.

Compact example:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewContent-Start?PageletEntryPointID=systempage.termsAndConditions.pagelet2-Page
to
http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/terms-and-conditions

Expand example:

http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/terms-and-conditions
to
http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewContent-Start?PageletEntryPointID=systempage.termsAndConditions.pagelet2-Page

Rule configuration parameters:

Name | Value | Description
pageletId | e.g., "systempage.termsAndConditions.pagelet2-Page" | The ID of the pagelet used as query parameter in the URL, e.g., http://.../ViewContent-Start?PageletEntryPointID=systempage.termsAndConditions.pagelet2-Page.
shortPath | e.g., "/terms-and-conditions" | The short path of the page.

Page Generic

Rewrites URLs that link to a page via pipeline start node ViewContent-Start. It automatically creates a compacted URL for the given pipeline parameter PageletEntryPointID following the pattern:

PageGeneric compacted URL mapping
 /<PageletEntryPoint:DisplayName><uniquePrefix><PageletEntryPointID>

Compact example:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/de_DE/-/EUR/ViewContent-Start?PageletEntryPointID=page.checkout
to
http://www.example.com/WFS/inSPIRED-inTRONICS-Site/de_DE/-/EUR/Kasse-Seiten-cms-page.checkout

Expand example:

http://www.example.com/WFS/inSPIRED-inTRONICS-Site/de_DE/-/EUR/Kasse-Seiten-cms-page.checkout
to
http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/de_DE/-/EUR/ViewContent-Start?PageletEntryPointID=page.checkout


This rewrite rule will use the pagelet entry point's display name if it is defined for the given locale. Otherwise the rewrite rule will use the pagelet entry point's ID.

The configuration parameter uniquePrefix is used to separate the readable name and the pagelet ID.

Name | Value | Description
uniquePrefix | e.g., "-cms-" | A string used as a separator between the CMS pagelet entry point's name and its ID.
excludedCharactersRule | [^a-zA-Z0-9] | Regular expression rule that defines characters to be replaced in the URL. By default, every character that is NOT (a-z or A-Z or 0-9) is replaced with a dash (see config replacementForExcludedCharacters). To support different character sets, e.g., Asian characters, remove the configuration excludedCharactersRule or adapt it to your needs. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
excludedCharacters | | Additionally, single characters can be excluded via the configuration excludedCharacters. Provide a list of characters, separated by white space, that should be replaced. If an excludedCharactersRule is configured, it is executed first and the single characters afterwards. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
replacementForExcludedCharacters | - | If any excluded character is found, it is replaced by the value configured in replacementForExcludedCharacters or by the default "-" (dash).
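To make the interplay of these parameters more concrete, here is a sketch of a PageGeneric rule that drops excludedCharactersRule and only excludes a few single characters. It assumes that omitting excludedCharactersRule switches off the regular-expression replacement, as the description above suggests; the rule name and priority are copied from the sample urlrewriterules.xml, and the chosen characters are purely illustrative.

```xml
<!-- sketch: PageGeneric rule without excludedCharactersRule (assumed to disable the regex replacement) -->
<rule type="PageGeneric" priority="10" name="page fall-back">
    <configurations>
        <configuration id="uniquePrefix">-cms-</configuration>
        <!-- single characters, separated by white space; CDATA keeps < > & well-formed in XML -->
        <configuration id="excludedCharacters"><![CDATA[< > &]]></configuration>
        <!-- each excluded character is replaced by a dash -->
        <configuration id="replacementForExcludedCharacters">-</configuration>
    </configurations>
</rule>
```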

Product

This rewrite rule rewrites URLs for a product detail page with the pipeline start-node ViewProduct-Start.

The only context parameter needed for compacting is the SKU. The legacy parameters CategoryName, CatalogID, CategoryID and CategoryDomainName are removed from compacted URLs.

The rewrite rule creates short paths following the pattern: /<ProductBO:Name>-zid<SKU>. The string -zid marks the beginning of the SKU.

Compact example:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewProduct-Start?SKU=M8182790134362
to
http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362

The rule expands a URL by looking for the SKU marker -zid within the URL. It then creates a URL path part of the form /ViewProduct-Start?SKU=<SKU>.

Expand example:

http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362
to
http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewProduct-Start?SKU=M8182790134362

Rule configuration parameters:

Name | Value | Description
fullCategoryPath | true | The complete category path is used in the rewritten URL: /RootCatalog/SubCategory/SubSubCategory/SubSubSubCategory/, e.g., /computers/servers/fileservers-raidarrays/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362
fullCategoryPath | false | Only the CatalogID and the leaf category of the category path are shown: /RootCatalog/SubSubSubCategory/, e.g., /Computers/fileservers-raidarrays/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362
localizedCategoryPath | true | Localized DisplayNames are used, e.g., /Computers/servers/fileservers-raidarrays/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362. Note: When localized, the localized names must be unique! Otherwise the correct category cannot be determined: Is it the category with ID 4353 and name "fileservers-raidarrays" or the category with ID 7675 and name "fileservers-raidarrays"?
localizedCategoryPath | false | The plain category IDs and the product SKU are used, e.g., /Computers/523/949/922/GoPro%20HERO4%20Silver%20Bundle-zidM8182790134362
excludedCharactersRule | [^a-zA-Z0-9] | Regular expression rule that defines characters to be replaced in the URL. By default, every character that is NOT (a-z or A-Z or 0-9) is replaced with a dash (see config replacementForExcludedCharacters). To support different character sets, e.g., Asian characters, remove the configuration excludedCharactersRule or adapt it to your needs. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
excludedCharacters | | Additionally, single characters can be excluded via the configuration excludedCharacters. Provide a list of characters, separated by white space, that should be replaced. If an excludedCharactersRule is configured, it is executed first and the single characters afterwards. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
replacementForExcludedCharacters | - | If any excluded character is found, it is replaced by the value configured in replacementForExcludedCharacters or by the default "-" (dash).
encodePath | | Set encodePath to true if non-Latin characters are used in the URL and excludedCharactersRule is not used. Otherwise the result might be an invalid URL path. For example, Korean characters like 클래식-드럼세탁기 get converted into %ed%81%b4%eb%9e%98%ec%8b%9d-%eb%93%9c%eb%9f%bc%ec%84%b8%ed%83%81%ea%b8%b0. Since the URL of a product also contains the category path, this configuration parameter enables encoding of category and product texts in the URL, see Cookbook - URL Rewriting | Recipe: Non-Latin Characters Cause an Invalid URL. Note: This configuration parameter is available from ICM 7.10.36.
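For a shop with non-Latin product and category names, the parameters above might be combined as in the following sketch. The rule name and priority are copied from the sample urlrewriterules.xml; leaving out excludedCharactersRule and enabling localizedCategoryPath are illustrative choices, not a recommended setup, and encodePath requires ICM 7.10.36 or later.

```xml
<!-- sketch: Product rule that keeps non-Latin names and percent-encodes them instead of replacing them -->
<rule name="Product Rule" type="Product" priority="600">
    <configurations>
        <configuration id="fullCategoryPath">true</configuration>
        <!-- localized names must be unique per category, see the note on localizedCategoryPath -->
        <configuration id="localizedCategoryPath">true</configuration>
        <!-- no excludedCharactersRule is configured, so encodePath is switched on -->
        <configuration id="encodePath">true</configuration>
    </configurations>
</rule>
```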

Category

Rewrites URLs that link to a category via the pipeline start node ViewStandardCatalog-Browse. The URL input parameters for identifying the category are CategoryName and CatalogID.

Example for compacting/expanding:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewStandardCatalog-Browse?CatalogID=Computers&CategoryName=922
to
http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/Computers/922/

Category URLs use a trailing slash "/" in short URLs as a unique identifier.

The trailing slash makes it possible to determine quickly whether a short URL matches the category rule (with trailing slash) or the product rule (no trailing slash). It is therefore mostly there for performance reasons:

  • Otherwise the category rule would need a lot of logic to determine that it does not match,
  • or a product short URL might match the category rule (unlikely but possible).

Rule configuration parameters:

Name | Value | Description
fullCategoryPath | true | The complete category path is used in the rewritten URL: /RootCatalog/SubCategory/SubSubCategory/SubSubSubCategory/, e.g., /computers/servers/fileservers-raidarrays/
fullCategoryPath | false | Only the CatalogID and the leaf category of the category path are shown: /RootCatalog/SubSubSubCategory/, e.g., /computers/fileservers-raidarrays/
localizedCategoryPath | true | Localized DisplayNames are used, e.g., /computers/servers/fileservers-raidarrays/. Note: When localized, the localized names must be unique! Otherwise the correct category cannot be determined: Is it the category with ID 4353 and name "fileservers-raidarrays" or the category with ID 7675 and name "fileservers-raidarrays"?
localizedCategoryPath | false | The plain CategoryIDs are used, e.g., /Computers/523/949/922/
excludedCharactersRule | [^a-zA-Z0-9] | Regular expression rule that defines characters to be replaced in the URL. By default, every character that is NOT (a-z or A-Z or 0-9) is replaced with a dash (see config replacementForExcludedCharacters). To support different character sets, e.g., Asian characters, remove the configuration excludedCharactersRule or adapt it to your needs. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
excludedCharacters | | Additionally, single characters can be excluded via the configuration excludedCharacters. Provide a list of characters, separated by white space, that should be replaced. If an excludedCharactersRule is configured, it is executed first and the single characters afterwards. For special characters use the section <![CDATA[< > &]]> inside the configuration tag.
replacementForExcludedCharacters | - | If any excluded character is found, it is replaced by the value configured in replacementForExcludedCharacters or by the default "-" (dash).
encodePath | | Set encodePath to true if non-Latin characters are used in the URL and excludedCharactersRule is not used. Otherwise the result might be an invalid URL path. For example, Korean characters in a category display name like 클래식-드럼세탁기 get converted into %ed%81%b4%eb%9e%98%ec%8b%9d-%eb%93%9c%eb%9f%bc%ec%84%b8%ed%83%81%ea%b8%b0. This configuration parameter encodes category texts in the URL. See also the description of this configuration in the product rewrite rule configurations. Note: This rule configuration parameter is available from ICM 7.10.36.

SitemapProducts

Creates rewritten URLs for a sitemap referencing a category containing links to all products of a given catalog or category.

Rewrites URLs that link to a sitemap via pipeline start node "ViewSitemap-ProductList". URL input parameters are CategoryName and CatalogID.

Example:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewSitemap-ProductList?CatalogID=Computers&CategoryName=922
to
http://www.example.com/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/sitemap-products/Computers/922/

Category URLs use a trailing slash "/" in short URLs as unique identifier.

Rule configuration parameters:

Name | Value | Description
fullCategoryPath | true | The complete category path is used in the rewritten URL: /RootCatalog/SubCategory/SubSubCategory/SubSubSubCategory/, e.g., /computers/servers/fileservers-raidarrays/
fullCategoryPath | false | Only the CatalogID and the leaf category of the category path are shown: /RootCatalog/SubSubSubCategory/, e.g., /computers/fileservers-raidarrays/
localizedCategoryPath | true | Localized DisplayNames are used, e.g., /computers/servers/fileservers-raidarrays/. Note: When localized, the localized names must be unique! Otherwise the correct category cannot be determined: Is it the category with ID 4353 and name "fileservers-raidarrays" or the category with ID 7675 and name "fileservers-raidarrays"?
localizedCategoryPath | false | The plain CategoryIDs are used, e.g., /Computers/523/949/922/
shortPath | e.g., "/sitemap-products" | A short path, usually starting with a slash.

ViewData

Rewrites URLs which call pipeline ViewData-Start.

Example for compacting/expanding:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewData-Start/2035272761?JumpTarget=ViewCart-View 
to
https://www.example.com/WFS/inTRONICS/en_US/-/USD/viewdata/2035272761?JumpTarget=ViewCart-View

Name | Value | Description
shortPath | e.g., "/viewdata", "/cart/de_DE" | A short path, usually starting with a slash. This parameter is optional. If no short path is configured, the rule uses "/viewdata" as short path by default. This parameter must not end with a "/".
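A ViewData rule could then be declared in urlrewriterules.xml along these lines. This is only a sketch: the type name ViewData is taken from the rule overview, while the rule name and the priority value are hypothetical.

```xml
<!-- hypothetical ViewData rule; name and priority are illustrative assumptions -->
<rule name="ViewData /viewdata" type="ViewData" priority="700">
    <configurations>
        <!-- optional; must not end with "/" -->
        <configuration id="shortPath">/viewdata</configuration>
    </configurations>
</rule>
```

Because /viewdata is also the documented default, the configuration element could presumably be left out with the same result.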

Search

Info

This concept is valid from Intershop 7.10.15.5.

Rewrites URLs which call pipeline ViewParametricSearch-SimpleOfferSearch.

Example for compacting/expanding:

http://www.example.com/INTERSHOP/web/WFS/inSPIRED-inTRONICS-Site/en_US/-/USD/ViewParametricSearch-SimpleOfferSearch?SearchTerm=Sony

to - if configuration searchTermLocation=path (=default)
https://www.example.com/WFS/inTRONICS/en_US/-/USD/search/Sony/

or to - if configuration searchTermLocation=parameter

https://www.example.com/WFS/inTRONICS/en_US/-/USD/search?SearchTerm=Sony

Name | Value | Description
shortPath | "/search" | A short path, usually starting with a slash. This parameter is optional. If no short path is configured, the rule uses /search as short path by default.
startNode | "ViewParametricSearch-SimpleOfferSearch" | This parameter is optional. If no start node is configured, the rule uses ViewParametricSearch-SimpleOfferSearch as start node by default.
searchTerm | "SearchTerm" | This parameter is optional. If no search term is configured, the rule uses SearchTerm as search term by default. This configuration is also used to get the entered search form parameter.
searchTermLocation | "path" or "parameter" | This parameter is optional. If no search term location is configured, the rule uses path as search term location by default. The second option is parameter (any value not equal to path).
shortPathMatch | "^/search/(.+)/$" | This parameter is optional. If no short path match is configured, the rule uses ^/search/(.+)/$ as short path match by default. This parameter is only required for searchTermLocation=path.

Example SearchRewriteRule configuration with custom parameters instead of default values:

urlrewriterules.xml - custom search rule section
<rule type="Search" priority="70" name="SimpleSearch">
  <configurations>
    <configuration id="startNode">ViewParametricSearch-CustomSimpleOfferSearch</configuration>
    <configuration id="shortPath">/searchCustom</configuration>
    <configuration id="shortPathMatch">^/searchCustom/(.+)/$</configuration>
    <configuration id="searchTerm">SearchTermCustom</configuration>
    <configuration id="searchTermLocation">parameter</configuration>
    <!-- <configuration id="searchTermLocation">path</configuration> -->
  </configurations>
</rule>
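
The two searchTermLocation modes can be sketched in a few lines of Python. This is only an illustration of the compact/expand idea, not the SearchRewriteRule implementation: the function names compact_search and expand_search are hypothetical, and percent-encoding the search term is an assumption. Only the default values (/search, SearchTerm, ^/search/(.+)/$) are taken from the table above.

```python
import re
from urllib.parse import quote, unquote, urlencode

SHORT_PATH = "/search"                 # default shortPath
SEARCH_TERM = "SearchTerm"             # default searchTerm parameter name
SHORT_PATH_MATCH = r"^/search/(.+)/$"  # default shortPathMatch


def compact_search(term, location="path"):
    """Build the short URL part for a search term (hypothetical helper)."""
    if location == "path":
        # searchTermLocation=path: the term becomes a path segment.
        return f"{SHORT_PATH}/{quote(term, safe='')}/"
    # Any other value is treated like searchTermLocation=parameter.
    return f"{SHORT_PATH}?{urlencode({SEARCH_TERM: term})}"


def expand_search(short_path):
    """Recover the search term from a path-style short URL, or None."""
    match = re.match(SHORT_PATH_MATCH, short_path)
    return unquote(match.group(1)) if match else None


print(compact_search("Sony"))               # /search/Sony/
print(compact_search("Sony", "parameter"))  # /search?SearchTerm=Sony
print(expand_search("/search/Sony/"))       # Sony
```

The sketch also shows why shortPathMatch only matters for the path mode: in parameter mode the search term never becomes part of the path, so there is nothing to extract with a regular expression.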

Rewrite Rules for PWA

Info

This concept is valid from Intershop 7.10.26.

There are four new rewrite rules which create sitemap URLs. These URLs only work for the Intershop Progressive Web App (PWA), not for the inSPIRED Demo store! Thus, only the compact code is implemented, not the code for expanding URLs.

Note

In case the URLs from a customized PWA differ from the current PWA implementation, new rewrite rules might be required. The Cookbook - URL Rewriting | Recipe: Create a New Rewrite Rule explains how to write a customized rewrite rule.

There are a few optional configurations which might compensate for some PWA route modifications.

These rules are closely linked to Concept - XML Sitemaps | XML Sitemaps and Intershop PWA and its configurations in syndication-targets.properties (cluster configuration directory).

The pipeline names configured for sitemapPipeline and viewingPipeline (ViewSiteMapXMLforPWA-Start, ViewProductPWA-Start, ViewStandardCatalogPWA-Browse, and ViewContentPWA-Start) do not exist as pipelines. They only serve as unique identifiers for the rewrite rules described in this chapter.

For all PWA rewrite rules, the protocol is fixed to https as configured in syndication-targets.properties:

intershop.syndication.target.Sitemaps-PWA.protocol=https

SitemapRangePWA

Description

This rule generates the links for the initial sitemap_pwa.xml file.

This file contains URLs pointing to the zipped XML file(s) in the ICM shared file system (SFS), which contain the actual sitemap content for products, categories, and pages.

The file name and its extension are composed of the following parts:

  • syndication-id=sitemap_pwa
  • objectType=product, catalogcategory, or staticpage
  • extension=.xml.gz

Example URLs

sitemap_pwa
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<sitemap>
		<loc>https://intershoppwa.azurewebsites.net/sitemap_pwa-product-0.xml.gz</loc>
		<lastmod>2020-10-16T18:20:15+02:00</lastmod>
	</sitemap>
	<sitemap>
		<loc>https://intershoppwa.azurewebsites.net/sitemap_pwa-catalogcategory-0.xml.gz</loc>
		<lastmod>2020-10-16T18:20:41+02:00</lastmod>
	</sitemap>
	<sitemap>
		<loc>https://intershoppwa.azurewebsites.net/sitemap_pwa-staticpage-0.xml.gz</loc>
		<lastmod>2020-10-16T18:20:41+02:00</lastmod>
	</sitemap>
</sitemapindex>

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the default URLs for the XML sitemap for PWA.

urlrewriterules.xml - default SitemapRangePWA rule section
<rule type="SitemapRangePWA" priority="100" name="sitemap range pwa links"></rule>

Optional Rewrite Rule Configurations

These optional configurations replace the default configurations:

urlrewriterules.xml - custom SitemapRangePWA rule section
<rule type="SitemapRangePWA" priority="100" name="sitemap range pwa links">
  <configurations>
    <!-- below are optional parameters to customize the default behavior of the SitemapRangePWA rewrite rule -->
    <configuration id="shortPath">/syndication-</configuration>
    <configuration id="syndicationID">sitemap_for_pwa</configuration>
    <configuration id="pwaHost">www.intershop.com</configuration>
  </configurations>
</rule>

This configuration would result in URLs like: <loc>https://www.intershop.com/syndication-sitemap_for_pwa-staticpage-0.xml.gz</loc>
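
How the optional configurations shape the generated link can be sketched as follows. The helper sitemap_range_url is hypothetical, and the default short path "/" is an assumption inferred from the example URLs above, not a documented default.

```python
def sitemap_range_url(host, object_type, index,
                      short_path="/",                # assumed default
                      syndication_id="sitemap_pwa",  # default syndication ID
                      extension=".xml.gz"):
    """Compose one <loc> entry of the sitemap index (illustrative only)."""
    # The protocol is fixed to https for all PWA rewrite rules.
    return f"https://{host}{short_path}{syndication_id}-{object_type}-{index}{extension}"


# Default configuration
print(sitemap_range_url("intershoppwa.azurewebsites.net", "product", 0))
# https://intershoppwa.azurewebsites.net/sitemap_pwa-product-0.xml.gz

# Custom shortPath, syndicationID and pwaHost as in the rule above
print(sitemap_range_url("www.intershop.com", "staticpage", 0,
                        short_path="/syndication-",
                        syndication_id="sitemap_for_pwa"))
# https://www.intershop.com/syndication-sitemap_for_pwa-staticpage-0.xml.gz
```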

Removal of Optional Rewrite Rule Configuration Parameter sitemapFileExtension - from ICM 7.10.38

As of ICM 7.10.38, the sitemap generation UI can be configured to create XML files instead of zip archives containing XML files.

The optional configuration parameter sitemapFileExtension and its value noFileExtension have been removed because they were never used.
Since the file extension now depends on the sitemap compression configuration in the back office, changing the extension in the rewrite rule configuration is not useful.

SitemapProductPWA

Description

This rule is based on the Category rewrite rule and inherits some of its configuration options (see Category rule details), because the largest part of a sitemap product URL for the PWA is the category to which the product is assigned.

It is used when the pipeline name is: ViewProductPWA-Start.

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the default product URLs for the XML sitemap for PWA.

Rule configuration parameters:

Name

Value

Description

slugifyPwaDefault

true

The slugify method usually handles any characters in a string that are problematic for URLs. This may apply to any localized texts from categories and products used for the URL.
The default behavior of the slugify method is to change all characters to lower case, to convert, e.g., German umlauts, and to strip accents from characters used in, e.g., French and Czech.

The PWA has its own URL handling and, therefore, there is no need to adapt URLs. So the default for the PWA is to keep these characters unchanged in the resulting URLs.
The behavior of the slugify method can be changed by configuration if needed.

excludedCharacters

( )

Removes the characters '(' and ')' from the URLs so that they do not cause any problems.
The excludedCharacters configuration takes a space-separated list of characters to be removed.

excludedCharacters

( ) &amp;

Since 7.10.40: additionally removes the ampersand character '&' from the URL path, because ampersands belong in the parameters part of a URL, not in the path part.

Example configuration for SitemapProductPWA:

urlrewriterules.xml - default SitemapProductPWA rule section
<rule type="SitemapProductPWA" priority="100" name="sitemap product pwa links">
	<configurations>
		<configuration id="slugifyPwaDefault">true</configuration>
		<configuration id="excludedCharacters">( ) &amp;</configuration>
	<!--	<configuration id="excludedCharacters">( )</configuration> before 7.10.40 release -->
	</configurations>
</rule>
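
The effect of the space-separated excludedCharacters list can be illustrated with a short sketch. The helper remove_excluded is hypothetical; it only mimics the documented behavior and is not the rule's actual code. The call to unescape stands in for the XML parser, which hands the value "( ) &amp;" to the rule as "( ) &".

```python
from xml.sax.saxutils import unescape


def remove_excluded(text, excluded_config):
    """Remove every character named in a space-separated configuration value."""
    # Decode XML entities first, then split the list on whitespace.
    for character in unescape(excluded_config).split():
        text = text.replace(character, "")
    return text


print(remove_excluded("Pentax-(Optio)-&-RZ10", "( )"))        # Pentax-Optio-&-RZ10
print(remove_excluded("Pentax-(Optio)-&-RZ10", "( ) &amp;"))  # Pentax-Optio--RZ10
```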

Optional Rewrite Rule Configurations

See Optional PWA Rule Configurations.

Example URLs

sitemap_product
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
	<loc>https://intershoppwa.azurewebsites.net/Digital-Cameras/Pentax-Optio-RZ10-sku5920586-catCameras-Camcorders.575</loc>
	<lastmod>2020-09-30T19:57:14+02:00</lastmod>
	<changefreq>weekly</changefreq>
	<priority>0.8</priority>
	<image:image>
		<image:loc>https://intershoppwa.azurewebsites.net:443/INTERSHOP/static/WFS/inSPIRED-inTRONICS-Site/rest/inSPIRED/en_US/L/5920586-7387.jpg</image:loc>
		<image:title>Pentax Optio RZ10</image:title>
		<image:caption>Pentax Optio RZ10</image:caption>
	</image:image>
	<image:image>
		<image:loc>https://intershoppwa.azurewebsites.net:443/INTERSHOP/static/WFS/inSPIRED-inTRONICS-Site/rest/inSPIRED/en_US/S/5920586-7387.jpg</image:loc>
		<image:title>Pentax Optio RZ10</image:title>
		<image:caption>Pentax Optio RZ10</image:caption>
	</image:image>
</url>
...
</urlset>

SitemapCategoryPWA

Description

This rule is based on the Category rewrite rule and inherits some of its configuration options (see Category rule details).

It is used when the pipeline name is: ViewStandardCatalogPWA-Browse.

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the default category URLs for the XML sitemap for PWA.

See the explanations of the rule configuration parameters slugifyPwaDefault and excludedCharacters in SitemapProductPWA | Default Rewrite Rule Configuration.

Example configuration for SitemapCategoryPWA:

urlrewriterules.xml - default SitemapCategoryPWA rule section
<rule type="SitemapCategoryPWA" priority="100" name="sitemap category pwa links">
	<configurations>
		<configuration id="slugifyPwaDefault">true</configuration>
		<configuration id="excludedCharacters">( ) &amp;</configuration>
	<!--	<configuration id="excludedCharacters">( )</configuration> before 7.10.40 release -->
	</configurations>
</rule>

Optional Rewrite Rule Configurations

See Optional PWA Rule Configurations.

Example URLs

sitemap_category
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
	<loc>https://intershoppwa.azurewebsites.net/Backpacks,-Notebook-Bags-&amp;-Cases-catComputers.1835.3003</loc>
	<lastmod>2020-09-30T19:55:03+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
<url>
	<loc>https://intershoppwa.azurewebsites.net/Remote-Controls-catHome-Entertainment.1058.857</loc>
	<lastmod>2020-09-30T19:55:09+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
<url>
	<loc>https://intershoppwa.azurewebsites.net/Firewire-Cables-catComputers.106.830.1306</loc>
	<lastmod>2020-09-30T19:55:03+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
...
</urlset>

SitemapContentPagePWA

Description

This rule is based on the Page rewrite rule and inherits some of its configuration options, see Page rule details.

It is used when the pipeline name is: ViewContentPWA-Start.

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the static pages' URLs for the XML sitemap for PWA.

There are two ways to configure the SitemapContentPagePWA rule:

  • The default behavior simply appends the pageletId to the default shortPath '/page', e.g.: /page/page.helpdesk.faq
  • For an individual configuration, a specific pageletId is mapped to a configured unique shortPath. The host name can be set as an optional configuration.
    • The mode depends on whether the pageletId configuration is set for a SitemapContentPagePWA rule.
    • The specific rule (privacy-policy) must have a higher priority (105) than the common rule (100).

In both cases, the optional parameter pwaHost can be used for local setups or debugging purposes. It forces the host name to be the one configured in the rule instead of the one from the application configuration for 'rest' applications.

urlrewriterules.xml - SitemapContentPagePWA rule section
<!-- individual configuration -->
<rule type="SitemapContentPagePWA" priority="105" name="sitemap content page pwa privacy-policy">
  <configurations>
	<configuration id="pageletId">systempage.privacyPolicy.pagelet2-Page</configuration>
	<configuration id="shortPath">/en/privacy-policy</configuration>
	<!-- <configuration id="pwaHost">www.intershop.com</configuration> -->
  </configurations>
</rule>
<!-- default -->
<rule type="SitemapContentPagePWA" priority="100" name="sitemap content page pwa links"></rule>
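
Why the specific rule needs the higher priority can be seen in a small sketch. It assumes, as described for priorities in general, that rules are tried in descending priority order and that the first matching rule wins. The function content_page_path and the dictionary layout are hypothetical.

```python
RULES = [
    {"priority": 105,
     "pageletId": "systempage.privacyPolicy.pagelet2-Page",
     "shortPath": "/en/privacy-policy"},
    {"priority": 100},  # default rule: no pageletId configured
]


def content_page_path(pagelet_id, rules=RULES):
    """Return the path of the first rule that accepts the pagelet."""
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        configured = rule.get("pageletId")
        if configured is None:
            # Default mode: append the pageletId to the short path.
            return f"{rule.get('shortPath', '/page')}/{pagelet_id}"
        if configured == pagelet_id:
            # Individual mode: the configured short path replaces everything.
            return rule["shortPath"]
    return None


print(content_page_path("page.helpdesk.faq"))
# /page/page.helpdesk.faq
print(content_page_path("systempage.privacyPolicy.pagelet2-Page"))
# /en/privacy-policy
```

If the specific rule had a lower priority than the default rule, the default rule would match first and the configured short path would never be used.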

Example URLs

sitemap_static-page
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
	<loc>https://intershoppwa.azurewebsites.net/page/page.helpdesk.faq</loc>
	<lastmod>2020-09-30T20:00:09+02:00</lastmod>
	<changefreq>yearly</changefreq>
	<priority>0.3</priority>
</url>
<url>
	<loc>https://www.intershop.com/en/privacy-policy</loc>
	<lastmod>2020-10-05T12:34:00+02:00</lastmod>
	<changefreq>yearly</changefreq>
	<priority>0.3</priority>
</url>
<url>
	<loc>https://intershoppwa.azurewebsites.net/page/systempage.termsAndConditions.pagelet2-Page</loc>
	<lastmod>2020-10-05T12:34:00+02:00</lastmod>
	<changefreq>yearly</changefreq>
	<priority>0.3</priority>
</url>
...
</urlset>

Optional PWA Rule Configurations

Description

There are some configurations which work for all PWA rewrite rules. These optional configurations are described here.

  • The default pipeline name that has to match for the rule to be applied can be changed with the configuration startNode.
  • For all PWA rewrite rules, the host name is taken from one of the following sources: the External-Base-URL configured for the assigned application, the rewrite rule configuration pwaHost, or intershop.WebServerSecureURL.
    • With the configuration pwaHost, each rule can set its own host name for its generated sitemap URLs.
    • Image URLs are an exception: they always depend on intershop.WebServerSecureURL.
  • The behavior of the slugify method can be controlled individually. This method prevents URL issues, see SitemapProductPWA | Default Rewrite Rule Configuration.
    • For the PWA, the default behavior is set with slugifyPwaDefault=true, which results in URLs as described in section SitemapProductPWA | Default Rewrite Rule Configuration.
    • In case some character usually handled by the slugify method causes trouble as part of the URL, a more selective configuration is possible:
      • slugifyPreventToLowerCase=true prevents the slugify method from converting all characters to lower case.
      • slugifyPreventReplaceUmlauts=true prevents the slugify method from converting German umlauts from ä to ae, ö to oe, and so on.
      • slugifyPreventStripAccents=true prevents the slugify method from stripping accents from characters used in, e.g., French or Czech.
      • slugifyPwaDefault=true is equivalent to setting all of the above to true, so the URLs contain upper-case characters, German umlauts, and accented characters.
  • The configuration charactersToEncode was introduced to encode special characters in case they cause problems.
    • It defines the characters that are encoded using java.net.URLEncoder.encode(...), e.g., '-6GB-(2GB-x-3)-sku' becomes '-6GB-%282GB-x-3%29-sku'.
    • The charactersToEncode configuration takes a list of characters to be encoded. No separation character! See example below.
  • The configuration categoryPathPrefix replaces the default -cat with a customized value. It is only valid for the SitemapProductPWA and SitemapCategoryPWA rules.
    • from: https://intershoppwa.azurewebsites.net/Remote-Controls-catHome-Entertainment.1058.857
    • to: https://intershoppwa.azurewebsites.net/Remote-Controls-category-Home-Entertainment.1058.857
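
The interplay of the slugify switches can be sketched in Python. This is a simplified model under stated assumptions, not the ICM slugify method: the function name, the keyword arguments, and the umlaut table (including the upper-case mappings) are hypothetical stand-ins for the configurations listed above.

```python
import unicodedata

# Assumed replacement table; the text above only names "ä to ae, ö to oe, and so on".
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def slugify(text, prevent_to_lower_case=False, prevent_replace_umlauts=False,
            prevent_strip_accents=False, pwa_default=False):
    """Apply the three slugify steps unless a switch prevents them."""
    if pwa_default:
        # slugifyPwaDefault=true behaves like all three prevent switches.
        prevent_to_lower_case = prevent_replace_umlauts = prevent_strip_accents = True
    if not prevent_replace_umlauts:
        text = "".join(UMLAUTS.get(ch, ch) for ch in text)
    if not prevent_strip_accents:
        # Decompose characters and drop the combining accent marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if not prevent_to_lower_case:
        text = text.lower()
    return text


print(slugify("Kühlventilatoren Café"))                    # kuehlventilatoren cafe
print(slugify("Kühlventilatoren Café", pwa_default=True))  # Kühlventilatoren Café
print(slugify("Über", prevent_to_lower_case=True))         # Ueber
```

In this sketch, umlauts are replaced before accents are stripped. Otherwise the accent step would already turn ä into a, and the umlaut replacement would have nothing left to do.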

Example - urlrewriterules.xml

urlrewriterules.xml - configuration
<rule type="SitemapProductPWA" priority="105" name="sitemap product pwa links customized">
  ...
  <configurations>
    <configuration id="startNode">ViewProductPWACustom-Start</configuration>
    <configuration id="pwaHost">www.customized.intershop.pwa.azurewebsites.net:449</configuration>

    <!-- setting both 'slugifyPwaDefault' and 'slugifyPrevent...' at the same time does not make sense -->
    <!-- <configuration id="slugifyPwaDefault">true</configuration> -->
    <configuration id="slugifyPreventToLowerCase">true</configuration>
    <configuration id="slugifyPreventReplaceUmlauts">true</configuration>
    <configuration id="slugifyPreventStripAccents">true</configuration>

    <!-- characters are listed without a separation character -->
    <configuration id="charactersToEncode">()&amp;</configuration>
    <configuration id="categoryPathPrefix">-category-</configuration>
    ...
  </configurations>
</rule>
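
The following sketch illustrates charactersToEncode and categoryPathPrefix. It uses Python's urllib.parse.quote as a stand-in for java.net.URLEncoder.encode; both produce %28, %29 and %26 for '(', ')' and '&', but they differ for other characters such as the space. The helper names and the way the category part is concatenated are assumptions derived from the example URLs above.

```python
from urllib.parse import quote


def encode_selected(text, characters_to_encode):
    """Percent-encode only the characters listed in the configuration value."""
    return "".join(quote(ch, safe="") if ch in characters_to_encode else ch
                   for ch in text)


def category_url_part(name, category_path, category_path_prefix="-cat"):
    """Join the display name and the category path with the configured prefix."""
    return f"{name}{category_path_prefix}{category_path}"


print(encode_selected("-6GB-(2GB-x-3)-sku", "()&"))
# -6GB-%282GB-x-3%29-sku
print(category_url_part("Remote-Controls", "Home-Entertainment.1058.857"))
# Remote-Controls-catHome-Entertainment.1058.857
print(category_url_part("Remote-Controls", "Home-Entertainment.1058.857", "-category-"))
# Remote-Controls-category-Home-Entertainment.1058.857
```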

Rewrite Rules for PWA3

Info

This concept is valid from Intershop 7.10.40.

There are three new rewrite rules which create sitemap URLs. These URLs only work for the Intershop Progressive Web App (PWA) version 3, not for the PWA before version 3, nor for the inSPIRED Demo store!

Since the release of PWA3, its URLs have changed, which makes the 'old' PWA sitemap links incompatible. The rules described in this chapter are designed to create PWA3-compatible URLs for the sitemap.

Note

In case the URLs from a customized PWA3 differ from the current PWA3 implementation, new rewrite rules might be required. The Cookbook - URL Rewriting | Recipe: Create a New Rewrite Rule explains how to write a customized rewrite rule.

There are a few optional configurations which might compensate for some PWA route modifications.

These rules are closely linked to Concept - XML Sitemaps | XML Sitemaps and Intershop PWA and its configurations in syndication-targets.properties (cluster configuration directory).

The pipeline names configured for sitemapPipeline and viewingPipeline (ViewSiteMapXMLforPWA-Start, ViewProductPWA3-Start, ViewStandardCatalogPWA3-Browse, and ViewContentPWA3-Start) do not exist as pipelines. They only serve as unique identifiers for the rewrite rules described in this chapter.

For all PWA3 rewrite rules, the protocol is fixed to https as configured in syndication-targets.properties:

intershop.syndication.target.Sitemaps-PWA3.protocol=https

SitemapRangePWA3

Description

This rule generates the links for the initial sitemap_pwa.xml file.

No new PWA3 version of this rule has been developed, because nothing has changed here between the PWA versions before version 3 and PWA version 3.

For details, see the documentation for SitemapRangePWA.

SitemapProductPWA3

Description

This rule is based on the Category rewrite rule and uses some of its configuration options (see Category rule details), because the first and the last part of a sitemap product URL for the PWA3 are derived from the category to which the product is assigned.

It is used when the pipeline name is ViewProductPWA3-Start, which is configured in the section Sitemaps-PWA3 of the syndication-targets.properties file.

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the default product URLs for the XML sitemap for PWA3.

Rule configuration parameters:

Name

Value

Description

slugifyPwa3Default

true

The slugify method usually handles any characters in a string that are problematic for URLs.
This may apply to any localized texts from categories and products used as part of the URL.
The default behavior of the slugify method is to change all characters to lower case, to convert, e.g., German umlauts, and to strip accents from characters used in, e.g., French and Czech.

The PWA3 has its own URL handling and, therefore, there are only minor changes to the URLs. So the default for the PWA3 is to keep these characters unchanged in the resulting URLs, with one exception: the conversion to lower case is now applied.
The behavior of the slugify method can be changed with this configuration if needed.

excludedCharactersRule

[ &\(\)=]

Removes the characters <space>, '&', '(', ')' and '=' from the URLs so that they do not cause any problems.
The excludedCharactersRule is a regular expression that defines the characters to be removed.
It can contain either a list of blacklisted characters or a list of whitelisted characters.

excludedCharactersRule

[^a-zA-Z0-9äöüÄÖÜé]

A whitelist such as [^a-zA-Z0-9äöüÄÖÜé] is currently not used for this rule.
Since neither a blacklist nor a whitelist can be assumed to cover all possibilities, the rule can be switched to whichever approach fits better.
It can be modified to your needs.

urlCharactersNotToEncode

/,'

For better SEO ranking, the PWA(3) URLs contain localized texts.
These texts sometimes contain characters that are problematic in a URL and are therefore encoded. Example: / becomes %2F when encoded.
To match the URLs used in the PWA3, the sitemap generator code must leave a few of those characters unencoded.
This configuration makes that happen, so that '/', ',' and the apostrophe (') are not URL-encoded.

Example configuration for SitemapProductPWA3:

urlrewriterules.xml - default SitemapProductPWA3 rule section
<rule type="SitemapProductPWA3" priority="100" name="sitemap product pwa links for PWA3">
	<configurations>
		<configuration id="slugifyPwa3Default">true</configuration>
		<!-- the ampersand must be escaped as &amp; inside the XML file -->
		<configuration id="excludedCharactersRule">[ &amp;\(\)=]</configuration>
		<!-- <configuration id="excludedCharactersRule">[^a-zA-Z0-9äöüÄÖÜé]</configuration> white-listed characters example -->
		<configuration id="urlCharactersNotToEncode">/,'</configuration>
	</configurations>
</rule>
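
The two PWA3-specific configurations can be tried out with a short sketch. It only demonstrates the individual steps (regular-expression removal and selective URL encoding) and does not reproduce the complete URL generation; the helper names are hypothetical, and Python's re and urllib.parse.quote stand in for the Java equivalents.

```python
import re
from urllib.parse import quote

BLACKLIST = r"[ &\(\)=]"            # default excludedCharactersRule
WHITELIST = r"[^a-zA-Z0-9äöüÄÖÜé]"  # alternative: remove everything not listed


def remove_excluded(text, rule=BLACKLIST):
    """Remove every character matched by the excludedCharactersRule."""
    return re.sub(rule, "", text)


def encode_path(text, not_to_encode="/,'"):
    """URL-encode text but keep the urlCharactersNotToEncode characters."""
    return quote(text, safe=not_to_encode)


print(remove_excluded("SSD (240GB) & more=x"))       # SSD240GBmorex
print(remove_excluded("Café-Möbel 24/7", WHITELIST))  # CaféMöbel247
print(encode_path("computer/datenspeicher/festkörperdrives"))
# computer/datenspeicher/festk%C3%B6rperdrives
```

The last line shows the effect of keeping '/' unencoded: the category path stays readable, while the umlaut is percent-encoded as in the example URLs of this section.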

Optional Rewrite Rule Configurations

See Optional PWA Rule Configurations.

Example URLs

sitemap_product
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
	<loc>https://intershoppwa.azurewebsites.net/computer/datenspeicher/festk%C3%B6rperdrives/a-data-s511-240gb-prd9013198-ctgComputers.206.1563</loc>
	<lastmod>2020-09-30T19:57:14+02:00</lastmod>
	<changefreq>weekly</changefreq>
	<priority>0.8</priority>
	<image:image>
		<image:loc>https://intershoppwa.azurewebsites.net:443/INTERSHOP/static/WFS/inSPIRED-inTRONICS-Site/rest/inSPIRED/en_US/L/9013198-7387.jpg</image:loc>
		<image:title>A Data S511</image:title>
		<image:caption>A Data S511</image:caption>
	</image:image>
	<image:image>
		<image:loc>https://intershoppwa.azurewebsites.net:443/INTERSHOP/static/WFS/inSPIRED-inTRONICS-Site/rest/inSPIRED/en_US/S/9013198-7387.jpg</image:loc>
		<image:title>A Data S511</image:title>
		<image:caption>A Data S511</image:caption>
	</image:image>
</url>
...
</urlset>

SitemapCategoryPWA3

Description

This rule is based on the Category rewrite rule and inherits some of its configuration options (see Category rule details).

It is used when the pipeline name is: ViewStandardCatalogPWA3-Browse.

Default Rewrite Rule Configuration

This urlrewriterules.xml configuration gets the default category URLs for the XML sitemap for PWA3.

See the explanations of the rule configuration parameters slugifyPwa3Default, excludedCharactersRule and urlCharactersNotToEncode in SitemapProductPWA3 | Default Rewrite Rule Configuration.

Example configuration for SitemapCategoryPWA3:

urlrewriterules.xml - default SitemapCategoryPWA3 rule section
<rule type="SitemapCategoryPWA3" priority="100" name="sitemap category pwa links for pwa3">
	<configurations>
		<configuration id="slugifyPwa3Default">true</configuration>
		<!-- the ampersand must be escaped as &amp; inside the XML file -->
		<configuration id="excludedCharactersRule">[ &amp;\(\)=]</configuration>
		<configuration id="urlCharactersNotToEncode">/,'</configuration>
	</configurations>
</rule>

Optional Rewrite Rule Configurations

See Optional PWA Rule Configurations.

Example URLs

sitemap_category
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
	<loc>https://intershoppwa.azurewebsites.net/computer/hardware-komponenten/geh%C3%A4use-komponenten/pc-k%C3%BChlventilatoren-ctgComputers.106.236.921</loc>
	<lastmod>2020-09-30T19:55:03+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
<url>
	<loc>https://intershoppwa.azurewebsites.net/computer/notebooks-und-pcs/backpacks,-notebook-bags-cases-ctgComputers.1835.3003</loc>
	<lastmod>2020-09-30T19:55:09+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
<url>
	<loc>https://intershoppwa.azurewebsites.net/konferenzausstattung/beamer-ctgpresentation-conferencing.data-projectors</loc>
	<lastmod>2020-09-30T19:55:03+02:00</lastmod>
	<changefreq>monthly</changefreq>
	<priority>0.4</priority>
</url>
...
</urlset>

SitemapContentPagePWA3

Description

This rule is based on the Page rewrite rule and inherits some of its configuration options, see Page rule details.

It is used when the pipeline name is: ViewContentPWA3-Start.

It is currently a copy of the SitemapContentPagePWA rule implementation, because the URLs have not changed from the previous PWA versions to PWA3.

Whitelisted Parameters

Description

Some marketing agencies (such as price search engines) or malicious users append unnecessary parameters to Intershop URLs. This may significantly lower the cache hit ratio of the page cache, increase the page cache size, and ultimately degrade application server performance due to unnecessary requests. Blacklisting named bad parameters does not help in such situations, because the parameters change dynamically.

The Pipeline-Whitelist-Parameter feature makes ICM more robust against such requests by whitelisting the parameters allowed in invoked URLs. Unknown parameters are removed from the URL before a page is looked up in the cache. Whitelisting uses either the declared input parameters of strict pipelines or, alternatively, the parameters from pipelinewhitelistparameters.properties.

Configuration

Since the feature is part of the URL rewrite expand functionality, it can be enabled in share/system/config/cluster/urlrewriterules.xml. Each rule can set the optional parameter restrictToWhitelistedParameters, which enables the Whitelisted Parameters feature for that particular rewrite rule. The feature is disabled by default and, in contrast to the old implementation, cannot be enabled globally.

For whitelisting parameters manually, check the configuration properties file under /share/system/config/cluster/pipelinewhitelistparameters.properties.
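
The idea behind whitelisting can be sketched as follows. The function restrict_to_whitelist is hypothetical and only models the described effect: unknown parameters are dropped so that URLs that differ only in such parameters map to the same page cache entry. The parameter names in the example are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def restrict_to_whitelist(url, allowed):
    """Drop every query parameter whose name is not whitelisted."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name in allowed]
    return urlunsplit(parts._replace(query=urlencode(kept)))


allowed = {"SearchTerm"}
first = restrict_to_whitelist(
    "https://www.example.com/search?SearchTerm=Sony&tracking=abc", allowed)
second = restrict_to_whitelist(
    "https://www.example.com/search?SearchTerm=Sony&tracking=xyz", allowed)
print(first)            # https://www.example.com/search?SearchTerm=Sony
print(first == second)  # True
```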

Redirect Status Code

Description

Redirecting URLs is sometimes necessary when an old shop system and its URLs are replaced by a new one, or when a product has simply been moved to a new category. To keep the SEO ranking for those URLs, redirecting the request to the new URL is a common approach. When a URL is redirected, an appropriate redirect status code should be used to inform the requester about the redirect. The default redirect status code is 302, which means 'Moved temporarily', but there are several other status codes for redirects (3xx). Therefore, the status code of redirecting links can be defined using the configuration redirectStatusCode.

Configuration

urlrewriterules.xml - example configuration for a new rule type
...
<!-- type "PageGeneric" creates /{PageletEntryPoint DisplayName}-cms-${p.PageletEntryPointID} -->
    <rule type="PageGeneric" priority="10" name="page fall-back">
        <configurations>
            <configuration id="uniquePrefix">-cms-</configuration>
            <configuration id="excludedCharactersRule">[^a-zA-Z0-9]</configuration>
            <configuration id="replacementForExcludedCharacters">-</configuration>
            <configuration id="redirectStatusCode">301</configuration>
        </configurations>
    </rule>
...
    <rule type="LegacyRedirect" priority="100" name="lr-100">
        <configurations>
            <!-- replaces the default redirect status code of 302 with a 301 -->
            <configuration id="redirectStatusCode">301</configuration>
        </configurations>
    </rule>
 ...

Form Parameters

Info

This concept is valid from Intershop 7.10.15.5.

Description

Rewrite rule implementations extend the abstract class BaseRewriteRule. This class contains two methods that return the values of form parameters sent via a POST request.

These methods are called:

  • String getServletFormParameter(String <form-parameter-name>)
  • String[] getServletFormParameters(String <form-parameter-name>)

The first method, getServletFormParameter, simply returns the first element of the array (String[]) retrieved by the second method, getServletFormParameters.

The rewrite rule SearchRewriteRule uses this method.

  • The form post parameter is only available in the initial request.
  • If a redirect is required, the parameter should be stored as a RewriteContext.QueryParameters entry.
  • This way, it is available for the redirecting request as well.
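
The relationship between the two methods can be modeled in a few lines. This Python class is only an analogy to the Java methods of BaseRewriteRule: the class name is hypothetical, the POST body is parsed with urllib instead of the servlet API, and returning None or an empty list for a missing parameter is an assumption.

```python
from urllib.parse import parse_qs


class FormParameters:
    """Minimal stand-in for form parameter access in a rewrite rule."""

    def __init__(self, post_body):
        # Parse an application/x-www-form-urlencoded request body.
        self._values = parse_qs(post_body, keep_blank_values=True)

    def get_servlet_form_parameters(self, name):
        """Return all values sent for the given form parameter."""
        return self._values.get(name, [])

    def get_servlet_form_parameter(self, name):
        """Return the first value of the form parameter, if any."""
        values = self.get_servlet_form_parameters(name)
        return values[0] if values else None


form = FormParameters("SearchTerm=Sony&SearchTerm=Canon")
print(form.get_servlet_form_parameters("SearchTerm"))  # ['Sony', 'Canon']
print(form.get_servlet_form_parameter("SearchTerm"))   # Sony
```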

Remove URL Parameter(s)

Info

This concept is valid from Intershop ICM 7.10.24.1.

This configuration allows removing one or more URL parameters from the compacted short URL by adding deleteShortPathParameter to the rule configuration in the file urlrewriterules.xml.

This applies to all currently available rewrite rules, except for the RegEx rule. The RegEx rule handles removing a URL parameter differently, by adding {d.parameterName} (see RegEx rule details).

Description

Sometimes it is necessary to remove one or more URL parameters that are not required by the link navigation logic. Removing them is also beneficial for SEO.

Always make sure that the removed URL parameters are not required for the pipeline logic!

If in doubt about a parameter, write your own rewrite rule with its own way of handling URL parameters (see Cookbook - URL Rewriting | Recipe: Create a New Rewrite Rule), or use a RegEx rule, which is more flexible in its configuration for handling any kind of URL parameter (see RegEx rule details).

Configuration

  • There is now an easy way to remove an unnecessary URL parameter from the compacted short URL.
  • Configure the parameter deleteShortPathParameter with a comma-separated list of URL parameter names to be removed.
  • The parameter names are case-sensitive!
  • Configured parameter names that cannot be found are ignored.

urlrewriterules.xml - example configuration to remove a URL parameter
...
<rule type="ViewData" priority="10" name="Pipeline ViewData-Start">
    <configurations>
        <configuration id="deleteShortPathParameter">JumpTarget</configuration>
    </configurations>
</rule>
<rule type="Search" priority="70" name="SimpleSearch">
    <configurations>
        <configuration id="deleteShortPathParameter">firstUnwantedSearchParameterName,anotherUnwantedParameterName</configuration>
    </configurations>
</rule>
 ...
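
The documented behavior of deleteShortPathParameter (comma-separated list, case-sensitive names, unknown names ignored) can be sketched like this. The function is hypothetical and not the rule implementation; trimming whitespace around the configured names is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def delete_short_path_parameters(short_url, config_value):
    """Remove the configured parameters from a compacted short URL."""
    names = {name.strip() for name in config_value.split(",") if name.strip()}
    parts = urlsplit(short_url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name not in names]  # exact, case-sensitive comparison
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = "/viewdata/2035272761?JumpTarget=ViewCart-View"
print(delete_short_path_parameters(url, "JumpTarget"))
# /viewdata/2035272761
print(delete_short_path_parameters(url, "jumptarget,NotPresent"))
# /viewdata/2035272761?JumpTarget=ViewCart-View
```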

Individual Rules - App-, Site-, Locale-, Currency- and Server Group-Specific

Most rewrite rules only use their type, priority, name, and configuration. However, there are also parameters that define whether a rule is applicable at all.

The rewrite rule parameters for app, site, locale, currency, and server group allow defining individual configurations of a rule type that apply only in a specific context. They act like a filter: the rule is applied only if there is a match.

Make sure that the more specific rules have a higher priority (higher number), because the priority defines the order in which the rules are applied. Otherwise a more general rule might be applied when a more specific rule is wanted.

It is recommended to always have a general rule as a fallback, otherwise an invalid URL error might occur.

Note

Even if the rule configurations somehow belong to each other (specific vs. more general configuration for the same rule type), their configurations are NOT inherited. The configurations are always specific to the individual rule.

isApplicable Section in Rewrite Rule Configuration

For more information about the isApplicable method, see Cookbook - URL Rewriting | Create a New Rewrite Rule.

Example of urlrewriterules.xml - all available (rule is applicable) parameters
<!-- some rule -->
<rule type="some-available-type" priority="125" name="name of the rule for some-available-type">
  <locales>
    <locale>de_DE</locale>
    <locale>fr_FR</locale>
  </locales>
  <appurlids>
    <appurlid>appID1</appurlid>
    <appurlid>appID2</appurlid>
  </appurlids>
  <currencies>
    <currency>EUR</currency>
    <currency>GBP</currency>
  </currencies>
  <sites>
    <site>some-Site</site>
  </sites>
  <server-groups>
    <server-group>WFS</server-group>
  </server-groups>
  <configurations>
    ...
  </configurations>
</rule>
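
The filter behavior and the role of the priority can be modeled with a small sketch. It assumes that all rules shown are of the same rule type and already match the requested pipeline; the functions is_applicable and select_rule as well as the dictionary layout are hypothetical and do not mirror the ICM classes.

```python
# Maps each filter group of a rule to the request attribute it is checked against.
FILTERS = {"locales": "locale", "appurlids": "appurlid", "currencies": "currency",
           "sites": "site", "server-groups": "server-group"}


def is_applicable(rule, request):
    """A rule applies if every filter it defines contains the request value."""
    return all(request.get(field) in rule[group]
               for group, field in FILTERS.items() if rule.get(group))


def select_rule(rules, request):
    """Try rules in descending priority order; the first applicable rule wins."""
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        if is_applicable(rule, request):
            return rule
    return None


rules = [
    {"name": "general fallback", "priority": 100},
    {"name": "german EUR", "priority": 125,
     "locales": ["de_DE"], "currencies": ["EUR"]},
]

print(select_rule(rules, {"locale": "de_DE", "currency": "EUR"})["name"])  # german EUR
print(select_rule(rules, {"locale": "en_US", "currency": "USD"})["name"])  # general fallback
```

If the general rule had the higher priority, it would be selected for every request, because a rule without filters is always applicable. This is why the more specific rule needs the higher number.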

For examples, see the Cookbook - URL Rewriting.

