|0|
Architect
@simonech
Simone Chiaretta
Fast and furious(ly) multilingual:
Publishing of EU politics in 24
languages
Council of the European Union
General Secretariat
Directorate-General Administration
Directorate Communication and Information Systems
Unit Design & Development
Disclaimer: The views expressed are solely those of the speaker and may not be regarded as stating an official position of the Council of the EU
Clause de non-responsabilité: Les avis exprimés n'engagent que leur auteur et ne peuvent être considérés comme une position officielle du Conseil de l'UE
Umbraco Specialist
@netaddicts
Dirk De Grave
|1|
• One of the 3 main EU institutions, together with the
Commission and the Parliament
• Made up of two Councils
– Council of the European Union
• meetings of ministers from each EU country
– European Council
• Heads of state or government of each EU member state
• Rotating presidency (different Member State every 6
months)
• 28 Countries
• 24 Official Languages
Council of the European Union
|2|
Consilium.Europa.EU
|3|
• Report on the work of the European Council, the Council
of the EU, the Eurogroup and their presidents to citizens
of the EU
• Inform the public and media with press releases and news
• List all meetings and their conclusions
Goal of the site
|4|
• Why we moved to Umbraco
• How we work
• How we deploy
• Scalability and system
• Our beautiful editor experience
• Standard based translation in 24 languages
• Integration with legacy and external systems
• Full-Text search
• Import from old CMS
Agenda
|5|
Why Umbraco
|6|
• pre-2011: Custom CMS
• 2011: Umbraco v4
• Jan 2015: Redesign + Commercial CMS
• 2017 Q3: Umbraco v7
Why we moved to Umbraco
|7|
• Independent study from CMS expert
• PoC done with multiple CMS
– Umbraco
– Drupal @ EC
– EPiServer
• Internal evaluation
Decision Process
|8|
• Faster editing and publishing process
• Simple editorial experience
• Better integration with translation tools
• Better search
• Able to handle a team of 30-ish editors
Expectations
|9|
• Umbraco is not multi-lingual by default
• Default translation flow is obsolete and weak
• Integration of legacy satellite applications
• CI and deployment
• Import from old CMS
Challenges
|10|
How we work
|11|
• Multi-discipline team: 2 analysts, 3 frontend
devs, 6 backend devs
• Scrum/sprint planning/daily standups
• Atlassian stack on premise as collaboration
tools for:
- Analysis (Confluence)
- Development (BitBucket)
- CI/CD (Bamboo)
- Sprint planning/issue tracking (Jira)
Development/team setup
|12|
Development/team setup
(Atlassian stack)
|13|
• Few options
– Umbraco as a Service #UaaS
– Shared database development
All developers use the same database for development
(doctypes/datatypes/…)
– Local database development
Each individual developer uses a local database,
either SQL Server or SQL Server CE
Umbraco development setup
|14|
Decision ?
• Umbraco as a Service = Nay
• Shared development = NO
– Perfect for a single environment (e.g. DEV)
– People don’t need to sync any metadata or content
– Prone to a cluttered database if someone forgets to
delete metadata or content that is not part of the
solution
• Local database development = YES
– People need to sync all metadata and “relevant” content
– Perfect fit for proofs of concept (switching between SQL
Server/SQL Server CE)
– Perfect fit for continuous integration with multiple
environments if we can find a way to synchronize metadata
and content
Umbraco development setup
|15|
Lightweight; can easily be fine-tuned to sync only the
minimal settings needed to get your environments into a clean
state, for both metadata and content; can
be automated
Challenges ?
- How to handle media efficiently?
- Dealing with exotic datatypes
- Long path names in continuous environment
uSync
|16|
.Core project (Business logic) / .Core.Tests
project
Startup configuration (DI = Unity, IoC, event handling)
PropertyValueConverters
ModelsBuilder
Model customizations
Controllers (Route hijacking all the things)
Services
AutoMapper
ViewModels
Project/solution setup
|17|
.Web project
Default Umbraco installation
Minimal changes allowed (.config) to smooth upgrades
App_Plugins for custom-built and 3rd party packages (NuGet/private
NuGet), even for packages from the online repository
.Frontend project
All things related to UI (views/js/css)
Frontend team uses their own workflow to generate assets which are
copied into the .Web project
.Resources project
Legacy dictionary
Project/solution setup
|18|
Workflow = GitFlow
Main develop branch
Each feature/bugfix = separate branch
PR with approval = Merge
Merge vs rebase!
Strict rules in implementing features
Features must be small
Changes unrelated to feature = rejected
Every feature is discussed upfront
Commits / commit messages / PR
messages must be very clear
|19|
Workflow = GitFlow
|20|
How we deploy
|21|
Build plan kicks in for every commit on feature/bugfix
branch pushed to remote repository
- Build must be successful
- All related tests must pass
- Continuous code quality assurance (SonarQube)
Build plan is only responsible for creating the
required artifacts
Build plan will never change anything
(files/configurations)
Build/Deployment pipeline (Bamboo)
|22|
Different build plans for DEV/TEST and STA/PROD
DEV/TEST = 1 single artifact
STA/PROD = 2 artifacts, 1x frontend and 1x backend
Build/Deployment pipeline (Bamboo)
|23|
Build/Deployment pipeline (Bamboo)
|24|
• A build becomes a candidate for “release” only if it
completed without errors
• Release plan takes care of moving the artifacts
from the build to your “destination”
environment
• Release plan is also responsible for
configuring the environment (web.config
transformations, uSync)
• Release is automated (DEV) or a
manual process (TEST/STA/PROD)
Build/Deployment pipeline (Bamboo)
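The split described above (the build never touches configuration, the release plan configures each environment) can be illustrated with a small sketch. This is not the team's Bamboo setup and not the real web.config transformation mechanism; it is a simplified Python stand-in, and the setting names (`Environment`, `SearchApiUrl`) and the helper `apply_overrides` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical config shipped unchanged inside the build artifact.
BASE_CONFIG = """<configuration>
  <appSettings>
    <add key="Environment" value="DEV" />
    <add key="SearchApiUrl" value="http://localhost:9200" />
  </appSettings>
</configuration>"""


def apply_overrides(config_xml: str, overrides: dict) -> str:
    """Return the config with appSettings values replaced per environment."""
    root = ET.fromstring(config_xml)
    for add in root.iter("add"):
        key = add.get("key")
        if key in overrides:
            add.set("value", overrides[key])
    return ET.tostring(root, encoding="unicode")


# At release time, only the target environment's values are swapped in.
transformed = apply_overrides(BASE_CONFIG, {"Environment": "TEST"})
print(transformed)
```

The same artifact can then be promoted from DEV to PROD, with only the overrides differing per environment.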
|25|
Build/Deployment pipeline (Bamboo)
|26|
System architecture
|27|
• Back-office shielded from internet
• Instant publishing of content
• Performance and availability
Security and publishing
|28|
Systems and caching
[Diagram: Umbraco CMS production environment. Components shown: Internet, Alteon load balancer, Varnish cache servers, Umbraco IIS web servers, authoring/back office, Windows file share cluster, database cluster. Connections are labelled HTTP/HTTPS, HTTP, SMB and SQL.]
|29|
• 3 level caching
1. ASP.NET and Umbraco caching
2. Varnish
3. CloudFlare (future)
Caching
|30|
• Reverse Proxy
• Caching based on HTTP Headers
• Behavior configurable with a DSL
• Possible to invalidate individual pages
Varnish
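A minimal sketch of the two Varnish points above: caching driven by HTTP headers, and invalidating a single page. It assumes the proxy honours `Cache-Control: s-maxage` and that its configuration accepts a `PURGE` request method; that is a common convention set up in the Varnish DSL, not something the slides confirm. The host name is hypothetical and the request is only built, never sent.

```python
import urllib.request


def cache_headers(max_age_seconds: int) -> dict:
    """Response headers a page could send so a shared cache keeps it."""
    return {"Cache-Control": f"public, s-maxage={max_age_seconds}"}


def build_purge_request(page_url: str) -> urllib.request.Request:
    """Build (but do not send) a PURGE request for one cached page."""
    return urllib.request.Request(page_url, method="PURGE")


# Hypothetical internal address of a cache server.
purge = build_purge_request("http://varnish.example.local/en/press/")
print(purge.get_method(), purge.full_url)
print(cache_headers(300))
```

Sending such a request right after publishing is one way to combine long cache lifetimes with instant publishing.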
|31|
CloudFlare
|32|
Making editors happy
|33|
• In-page editing experience
• Find content easily (even with thousands of
nodes)
Main requirements
|34|
Predecessor CMS editing experience
|35|
Predecessor CMS editing experience
|36|
Grid / NestedContent /
DocTypeGridEditor / Customized Vorto
|37|
Grid editing
|38|
Grid template output
|39|
Grid template customization
|40|
Grid settings
|41|
Custom content picker (with preview)
|42|
Listview (visualsearch.js)
|43|
24 languages in a box
|44|
|45|
• 1-1 translation of 24 languages
• Batch management of languages
• Localize only the minimum needed
• Export to industry standard XLIFF format
• Automatic import of translation
Requirements
|46|
• XML Localization Interchange File Format
• The only open standard bitext format
• OASIS standard since 2008
• Supported by all professional CAT tools in the
market
• Bitext is a file that contains both source and
target languages, correctly “aligned”
What is XLIFF
|47|
Tyger Tyger, burning bright,
Tigre! Tigre! Divampante fulgore
In the forests of the night;
Nelle foreste della notte,
What immortal hand or eye,
Quale fu l'immortale mano o l'occhio
Could frame thy fearful symmetry?
Ch'ebbe la forza di formare la tua agghiacciante simmetria?
William Blake / Giuseppe Ungaretti
What is bitext
|48|
Tyger Tyger, burning bright,      | Tigre! Tigre! Divampante fulgore
In the forests of the night;      | Nelle foreste della notte,
What immortal hand or eye,        | Quale fu l'immortale mano o l'occhio
Could frame thy fearful symmetry? | Ch'ebbe la forza di formare la tua agghiacciante simmetria?
William Blake                     | Giuseppe Ungaretti
What is bitext
|49|
msgid "Tyger Tyger, burning bright,"
msgstr "Tigre! Tigre! Divampante fulgore"
msgid "In the forests of the night;"
msgstr "Nelle foreste della notte,"
msgid "What immortal hand or eye,"
msgstr "Quale fu l'immortale mano o l'occhio"
msgid "Could frame thy fearful symmetry?"
msgstr "Ch'ebbe la forza di formare la tua agghiacciante simmetria?"
What is bitext
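The `msgid`/`msgstr` layout above can be produced from any list of aligned pairs. A minimal sketch, with hypothetical helper names (`po_escape`, `to_po`); it only covers the basic entry form shown on the slide, not headers, plurals or contexts.

```python
def po_escape(text: str) -> str:
    """Escape backslashes and double quotes for a quoted string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_po(pairs) -> str:
    """Render (source, target) pairs as msgid/msgstr entries."""
    entries = [
        f'msgid "{po_escape(source)}"\nmsgstr "{po_escape(target)}"'
        for source, target in pairs
    ]
    return "\n\n".join(entries) + "\n"


pairs = [
    ("Tyger Tyger, burning bright,", "Tigre! Tigre! Divampante fulgore"),
    ("In the forests of the night;", "Nelle foreste della notte,"),
]
po_text = to_po(pairs)
print(po_text)
```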
|50|
<source>Tyger Tyger, burning bright,</source>
<target>Tigre! Tigre! Divampante fulgore</target>
<source>In the forests of the night;</source>
<target>Nelle foreste della notte,</target>
<source>What immortal hand or eye,</source>
<target>Quale fu l'immortale mano o l'occhio</target>
<source>Could frame thy fearful symmetry?</source>
<target>Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target>
What is bitext
|51|
<source xml:lang="EN">Tyger Tyger, burning bright,</source>
<target xml:lang="IT">Tigre! Tigre! Divampante fulgore</target>
<source xml:lang="EN">In the forests of the night;</source>
<target xml:lang="IT">Nelle foreste della notte,</target>
<source xml:lang="EN">What immortal hand or eye,</source>
<target xml:lang="IT">Quale fu l'immortale mano o l'occhio</target>
<source xml:lang="EN">Could frame thy fearful symmetry?</source>
<target xml:lang="IT">Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target>
What is bitext
|52|
<unit id="1">
  <segment>
    <source xml:lang="EN">Tyger Tyger, burning bright,</source>
    <target xml:lang="IT">Tigre! Tigre! Divampante fulgore</target>
  </segment>
  <segment>
    <source xml:lang="EN">In the forests of the night;</source>
    <target xml:lang="IT">Nelle foreste della notte,</target>
  </segment>
  <segment>
    <source xml:lang="EN">What immortal hand or eye,</source>
    <target xml:lang="IT">Quale fu l'immortale mano o l'occhio</target>
  </segment>
  <segment>
    <source xml:lang="EN">Could frame thy fearful symmetry?</source>
    <target xml:lang="IT">Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target>
  </segment>
</unit>
What is bitext
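Because each segment keeps source and target together, reading the aligned pairs back is straightforward. A minimal sketch using only the Python standard library; it handles a bare `unit` fragment like the one on the slide (with the `id` attribute quoted so it parses as XML), not a complete XLIFF file with namespaces, and `aligned_pairs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

UNIT_XML = """<unit id="1">
  <segment>
    <source xml:lang="EN">Tyger Tyger, burning bright,</source>
    <target xml:lang="IT">Tigre! Tigre! Divampante fulgore</target>
  </segment>
  <segment>
    <source xml:lang="EN">In the forests of the night;</source>
    <target xml:lang="IT">Nelle foreste della notte,</target>
  </segment>
</unit>"""


def aligned_pairs(unit_xml: str):
    """Return one (source, target) tuple per segment, in document order."""
    unit = ET.fromstring(unit_xml)
    return [
        (segment.findtext("source", default=""),
         segment.findtext("target", default=""))
        for segment in unit.findall("segment")
    ]


for source, target in aligned_pairs(UNIT_XML):
    print(source, "=>", target)
```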
|53|
• Linked trees
– PRO: default Umbraco approach to localization
– CONS: everything else
• Nested nodes
– PRO: meaningful history, easier to manage programmatically
– CONS: not possible to sync grid structure between
languages, needs custom batch publishing actions (and
much more)
• Vorto
– PRO: just localize what’s needed, just one node per content item,
one grid structure for all
– CONS: loss of meaningful history, needs a custom
publishing “flag” per language, more difficult to manage
programmatically
3 options
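The Vorto-style option keeps every language of a property on a single node. A minimal sketch of what that implies for reading values and for finding untranslated languages; the JSON shape (a `values` object keyed by language code) is an assumption made for illustration, and both helper names are hypothetical.

```python
import json


def localized_value(stored_json: str, language: str, fallback: str = "en"):
    """Return the value for `language`, else the fallback language's value."""
    values = json.loads(stored_json).get("values", {})
    return values.get(language) or values.get(fallback)


def missing_languages(stored_json: str, required):
    """List the required languages that have no (non-empty) value yet."""
    values = json.loads(stored_json).get("values", {})
    return [lang for lang in required if not values.get(lang)]


# Assumed storage shape: one property, all languages side by side.
stored = json.dumps({"values": {"en": "Agriculture and Fisheries",
                                "it": "Agricoltura e pesca"}})

print(localized_value(stored, "it"))
print(localized_value(stored, "fr"))
print(missing_languages(stored, ["en", "fr", "it"]))
```

A check like `missing_languages` is one way a per-language publishing “flag” could be derived, since the node itself is published as a whole.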
|54|
• Vorto (customised)
• Custom “vorto-like” grid editor
• Custom translation component
Solution
|55|
Customised Vorto
|56|
Vorto in the grid
|57|
Custom translation flow (1)
|58|
Custom translation flow (1)
|59|
Custom translation flow (2)
|60|
Custom translation flow (3)
|61|
Extraction
|62|
Extraction
Umbraco
Generic document structure
Initial XLIFF (with HTML markup)
Split paragraphs and extract inline code
Segmentation
Apply Translation Memory
Off to Translation Workflow (SDL Studio)
Enrich with custom extensions
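Three of the extraction steps above (split paragraphs, extract inline code, segmentation) can be sketched in a few lines. This is not XliffLib's implementation: the regular expressions, the `{n}` placeholder syntax and the function names are assumptions, and real segmentation rules are considerably more careful than splitting on sentence-ending punctuation.

```python
import re

PARAGRAPH_RE = re.compile(r"<p>(.*?)</p>", re.DOTALL)
INLINE_TAG_RE = re.compile(r"</?(?:a|b|i|em|strong|span)\b[^>]*>")
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+")


def split_paragraphs(html: str):
    """Return the inner HTML of each <p> element."""
    return [p.strip() for p in PARAGRAPH_RE.findall(html)]


def extract_inline_codes(paragraph: str):
    """Replace inline tags with numbered placeholders; return text and tags."""
    codes = []

    def replace(match):
        codes.append(match.group(0))
        return f"{{{len(codes)}}}"

    return INLINE_TAG_RE.sub(replace, paragraph), codes


def segment(text: str):
    """Split text into sentences after '.', '!' or '?'."""
    return [s for s in SENTENCE_END_RE.split(text) if s]


html = "<p>The Council met <b>today</b>. Ministers agreed.</p><p>Next meeting in June.</p>"
for paragraph in split_paragraphs(html):
    text, codes = extract_inline_codes(paragraph)
    print(segment(text), codes)
```

Keeping the markup out of the translatable text means translators see plain sentences with placeholders, and the stored tags can be put back when the translation is merged.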
|63|
• Complete the system
• Make the generic Extraction/Merging library
open source
• Integrate the Umbraco-specific
extraction/merging into Umbraco Core
https://github.com/simonech/XliffLib
Future steps
|64|
Integrations
|65|
List of internal/external systems to
interact with
• PoolParty to enrich your content with valuable
metadata (taxonomies)
• Oracle database (MPO Meetings/Meeting planner)
• (TV)Newsroom
– Video API
– Asset/image library
• RSS feeds/Twitter feeds
|66|
PoolParty
|67|
PoolParty
|68|
PoolParty
Why?
- Exchange taxonomy between different units within the EU
Council, or even more… with the outside world and vice versa
Example:
A “Location” taxonomy may already exist “somewhere”, so we
should be able to transparently reference this taxonomy without the
need to create a new one
Europe > Belgium > Brussels capital region > Brussels > …
|69|
PoolParty
Automated tagging ?
• Automated content tagging is possible using a 3rd party solution, “Powertagging
with Umbraco and PoolParty”
• Didn’t really fit our requirements (legacy data, “taxonomy” currently not
semantically normalized)
Solution ?
On demand synchronization from PoolParty to Umbraco
• Limited number of syncs (~1/month)
• One way sync from PoolParty -> Umbraco
• Don’t rely on server availability
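A one-way, on-demand sync of this kind boils down to making a local copy mirror the remote taxonomy and reporting what changed. A minimal sketch with plain dictionaries standing in for both sides; the term ids and the function name are hypothetical, and nothing here calls the PoolParty API.

```python
def sync_taxonomy(remote_terms: dict, local_terms: dict) -> dict:
    """Make local_terms mirror remote_terms (one way) and summarize changes."""
    added = [k for k in remote_terms if k not in local_terms]
    updated = [k for k in remote_terms
               if k in local_terms and local_terms[k] != remote_terms[k]]
    removed = [k for k in local_terms if k not in remote_terms]

    for key in added + updated:
        local_terms[key] = remote_terms[key]
    for key in removed:
        del local_terms[key]
    return {"added": added, "updated": updated, "removed": removed}


# Hypothetical term ids and labels.
remote = {"loc:be": "Belgium", "loc:bru": "Brussels"}
local = {"loc:be": "Belgique", "loc:old": "Obsolete term"}

print(sync_taxonomy(remote, local))
print(local)
```

Because the site only ever reads the local copy, it keeps working when the taxonomy server is unavailable between syncs.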
|70|
PoolParty (Sync process)
|71|
PoolParty (Sync’ed data)
|72|
PoolParty
Challenges ?
• Enrich our sync’ed data with custom “attributes”
Examples:
• Set country flag for specific “location” taxonomies
• Change default descriptions of a taxonomy on the frontend website
– “Council of the European Union” -> “Council of the EU”
Solution ?
• Create a “developer centric” “taxonomy settings section” to create a link
between the sync’ed taxonomy and our custom metadata
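The link between sync'ed terms and custom metadata can be pictured as an overlay that is applied when the taxonomy is rendered, so the next sync never overwrites it. A minimal sketch; the setting keys (`display_label`, `flag`), the term ids and the file name are hypothetical.

```python
def apply_taxonomy_settings(synced_terms: dict, settings: dict) -> dict:
    """Overlay custom attributes on sync'ed terms without changing them."""
    result = {}
    for term_id, label in synced_terms.items():
        custom = settings.get(term_id, {})
        result[term_id] = {
            "label": custom.get("display_label", label),
            "flag": custom.get("flag"),
        }
    return result


synced = {"org:council": "Council of the European Union", "loc:be": "Belgium"}
settings = {
    "org:council": {"display_label": "Council of the EU"},
    "loc:be": {"flag": "be.svg"},
}
print(apply_taxonomy_settings(synced, settings))
```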
|73|
PoolParty (Taxonomy settings)
|74|
Meeting planner data (Oracle db)
External tools used by other departments to create “Meetings”
Challenges:
• Data stored in external database
• Data is only exposed through readonly views on the Oracle db
• ~3000 meetings currently in system and available online, about ~100 meetings
are created monthly
• Approx. 4 meetings/month need additional content editing before publishing
• Link with the existing sync’ed taxonomy
• Advanced search (date/taxonomy/…)
Do we import this data in Umbraco ?
|75|
Meeting planner data (Oracle db)
|76|
Meeting planner data (Oracle db)
Decisions:
• Don’t import any meeting data into Umbraco (you’ve got everything you need
already)
• Remove the connection to the Oracle db
• Push data from the Oracle db view to a custom SQL table
• Enrich meeting data at import and store it alongside the meeting data in the
custom SQL table
• Optimize the custom SQL table for maximum performance (indexes/…)
• Meetings created in Umbraco must reference a SQL record
Result:
• Searching/querying the db is still very fast (SQL and storage are optimized)
• Content editors can still use Umbraco to add more content
• The Umbraco system is not bloated with nodes that add no value
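The custom-table approach can be sketched with an in-memory SQLite database standing in for SQL Server. The table and column names, the index and the sample rows are all assumptions made for illustration; the point is only that imported rows live in an indexed table and are queried directly, outside the CMS content tree.

```python
import sqlite3


def create_meeting_store(conn):
    """Create the custom table and an index for date searches."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS meetings (
               meeting_id   INTEGER PRIMARY KEY,
               title        TEXT NOT NULL,
               meeting_date TEXT NOT NULL,
               taxonomy_id  TEXT)"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS ix_meetings_date ON meetings (meeting_date)"
    )


def import_meetings(conn, rows):
    """Insert or refresh rows pushed from the source view (repeatable)."""
    conn.executemany(
        "INSERT OR REPLACE INTO meetings "
        "(meeting_id, title, meeting_date, taxonomy_id) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def meetings_between(conn, start, end):
    """Return (id, title) of meetings in a date range, oldest first."""
    cursor = conn.execute(
        "SELECT meeting_id, title FROM meetings "
        "WHERE meeting_date BETWEEN ? AND ? ORDER BY meeting_date",
        (start, end),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_meeting_store(conn)
# Sample rows, invented for the example.
import_meetings(conn, [
    (1, "Agriculture and Fisheries Council", "2017-06-12", "config:agri"),
    (2, "European Council", "2017-06-22", None),
])
print(meetings_between(conn, "2017-06-01", "2017-06-15"))
```

A meeting page created in the CMS would then only need to store the `meeting_id` of the record it enriches.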
|77|
External asset library (TV)Newsroom
• Most assets are referenced from an internal asset library shared
across multiple teams/units/...
• Some assets are stored externally (Rackcdn.com)
• Still use the media section for all other assets though
Challenges ?
- Images are huge: very high resolution images, >10 MB
- Videos are stored externally; only a public API is available to fetch the content
(and thumbnail previews) (Challenge?)
|78|
External asset library (TV)Newsroom
Solution implemented
• ImageProcessor takes care of retrieving/storing/caching images from multiple
sources, over both HTTP and HTTPS
- Requires a .axd service with both HTTP and HTTPS endpoints
- Proxy configuration is still a bit flaky (PR?)
• Offload API requests that fetch info from the external source to an internal server, which
returns the results
- Fine-tune network/security
|79|
Full-text search
|80|
• Support of full-text search in 24 languages
• Boosting of particular elements of the pages
• Indexing of “composition” pages
• Indexing of external sources (PDFs, external
site)
• Fast availability of new/updated pages in the
index
Requirements
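Two of these requirements, per-language search and boosting of particular page elements, map naturally onto an Elasticsearch `multi_match` query with per-field boosts. A minimal sketch that only builds the request body; the index naming scheme (`pages-<language>`), the field names and the boost factors are assumptions, not the site's actual mapping.

```python
import json


def build_search_query(text: str, language: str):
    """Return (index name, query body) for a boosted full-text search."""
    index = f"pages-{language}"  # hypothetical: one index per language
    body = {
        "query": {
            "multi_match": {
                "query": text,
                # "^n" multiplies the score contribution of that field.
                "fields": ["title^3", "summary^2", "body"],
            }
        }
    }
    return index, body


index, body = build_search_query("bilancio", "it")
print(index)
print(json.dumps(body, indent=2))
```

Separate indices per language are one common way to give each language its own analyzer; other layouts are possible.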
|81|
Elastic Search
[Diagram: search architecture. Components shown: Frontend, Backend Search API, Elastic Search, Crawler (Apache Manifold). Flows are labelled Search, Crawling and Notification.]
|82|
• Just like Google
• Structured information passed with:
– HTTP headers
• etag: "078de59b16c27119c670e63fa53e5b51"
– Microdata:
<time itemprop="startDate" datetime="2017-06-08T14:45">June 8, 2:45pm</time>
– RDFa
<div profile="http://data.consilium.europa.eu/data/public_voting/rdf/schema/Configuration" typeof="Article">
<span property="http://data.consilium.europa.eu/data/public_voting/consilium/configuration/agri">Agriculture and Fisheries</span>
</div>
Crawling
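On the crawler side, structured values such as the microdata `startDate` above can be read straight from the page markup. A minimal sketch using Python's standard HTML parser; it handles only `<time>` elements carrying both `itemprop` and `datetime`, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MicrodataTimeParser(HTMLParser):
    """Collect itemprop -> datetime pairs from <time> elements."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "time" and "itemprop" in attributes and "datetime" in attributes:
            self.properties[attributes["itemprop"]] = attributes["datetime"]


def extract_time_properties(html: str) -> dict:
    parser = MicrodataTimeParser()
    parser.feed(html)
    return parser.properties


page = '<time itemprop="startDate" datetime="2017-06-08T14:45">June 8, 2:45pm</time>'
print(extract_time_properties(page))
```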
|83|
Import from legacy CMS (E-project)
|84|
Migrate “non-structured” content from
Ektron into Umbraco
|85|
• Non-structured = custom legacy XML format
• Storage
– Content: SQL Server
– Assets (images/PDFs): on disk
• Other requirements
– Process of importing content/assets has to be repeatable in a
CI/CD environment
– Iterative development: start small, grow fast
Migrate “non-structured” content from
Ektron into Umbraco
|86|
Looking at two “migration” tools
- CMSImport (@rsoeteman‘s well known package)
- Chauffeur (~Umbraco CLI tool started by @slace)
Migrate “non-structured” content from
Ektron into Umbraco
|87|
Introducing Chauffeur
“Chauffeur is a CLI for Umbraco, it will sit with your Umbraco websites bin folder and give you an
interface to which you can execute commands, known as Deliverables, against your installed
Umbraco instance.”
• Command line: perfect fit for our continuous integration/deployment scenario
• Lightweight: can be easily added or removed from your environments
– Drop assembly in /bin folder and you’re set, remove in production
– Ability to inject any Umbraco service API
– Code once, run anywhere (Build blocks of reusable deliverables)
– Create a chain of deliverables to run from (a .delivery file)
• Restrictions
- Publishing content won’t work!
Migrate “non-structured” content from
Ektron into Umbraco
|88|
Migrate “non-structured” content from
Ektron into Umbraco
|89|
Migrate “non-structured” content from
Ektron into Umbraco
For each content to be migrated
• Get record data out of the legacy Sql server database
• Create new content using Umbraco service API
• Transform property data using a custom object model and Json.NET to
serialize it to a “JSON string”
• Set property data on the new content
• Save new content in cms
Challenges
- Grid content (rte content)
- Customized Vorto implementation
- NestedContent / DocTypeGridEditor / Vorto and any possible
combinations
|90|
Migrate “non-structured” content from
Ektron into Umbraco
Deliverable transforms xml into json blob using our custom
data object model and Json.net (simplified example)
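The original deliverable is C# with Json.NET and is not reproduced in this transcript. The sketch below shows the same idea in Python under stated assumptions: the legacy XML layout (`title`, `paragraph`), the object model (`GridValue`, `GridRow`) and the `rte` editor alias are all hypothetical, and the real grid JSON is more elaborate.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class GridRow:
    editor_alias: str
    value: str


@dataclass
class GridValue:
    name: str
    rows: list


def legacy_xml_to_json(legacy_xml: str) -> str:
    """Map a legacy XML record onto the object model, then serialize it."""
    root = ET.fromstring(legacy_xml)
    rows = [
        GridRow(editor_alias="rte", value=(p.text or "").strip())
        for p in root.findall("paragraph")
    ]
    grid = GridValue(name=root.findtext("title", default=""), rows=rows)
    return json.dumps(asdict(grid), ensure_ascii=False)


# Hypothetical legacy record.
legacy = """<content>
  <title>Press release</title>
  <paragraph>First paragraph.</paragraph>
  <paragraph>Second paragraph.</paragraph>
</content>"""
print(legacy_xml_to_json(legacy))
```

Going through a typed object model rather than string concatenation keeps the generated JSON well-formed even as the combinations of nested editors grow.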
|91|
Chauffeur references
- https://our.umbraco.org/projects/collaboration/chauffeur/
- https://github.com/aaronpowell/chauffeur
- https://24days.in/umbraco-cms/2015/may-the-tools-be-with-you/
Migrate “non-structured” content from
Ektron into Umbraco
|92|
Conclusion
|93|
• First try to use what’s available out of the box or on Our
• If that is not enough, Umbraco can be heavily extended
• Umbraco can be used in “security conscious” entities
Conclusion
|94|
SUPER TAK! (Thanks a lot!)
|95|
?
Questions

Fast and furious(ly) multilingual: Publishing of EU politics in 24 languages with Umbraco

  • 1. |0| Architect @simonech Simone Chiaretta Fast and furious(ly) multilingual: Publishing of EU politics in 24 languages Council of the European Union General Secretariat Directorate-General Administration Directorate Communication and Information Systems Unit Design & Development Disclaimer: The views expressed are solely those of the speaker and may not be regarded as stating an official position of the Council of the EU Clause de non-responsabilité: Les avis exprimés n'engagent que leur auteur et ne peuvent être considérés comme une position officielle du Conseil de l'UE Umbraco Specialist @netaddicts Dirk De Grave
  • 2. |1| • One of the 3 main EU institutions together with Commission and Parliament • Made of two Councils – Council of European Union • meetings of ministries of each EU country – European Council • Head of states of each EU member state • Rotating presidency (different Member State every 6 months) • 28 Countries • 24 Official Languages Council of European Union
  • 4. |3| • Report on the work of the European Council, the Council of the EU, the Eurogroup and their presidents to citizens of the EU • Inform the public and media with press releases and news • List all meetings and meetings’ conclusion Goal of the site
  • 5. |4| • Why we moved to Umbraco • How we work • How we deploy • Scalability and system • Our beautiful editor experience • Standard based translation in 24 languages • Integration with legacy and external systems • Full-Text search • Import from old CMS Agenda
  • 7. |6| • pre-2011: Custom CMS • 2011: Umbraco v4 • Jan 2015: Redesign + Commercial CMS • 2017 Q3: Umbraco v7 Why we moved to Umbraco
  • 8. |7| • Independent study from CMS expert • PoC done with multiple CMS – Umbraco – Drupal @ EC – EPi Server • Internal evaluation Decision Process
  • 9. |8| • Faster editing and publishing process • Simple editorial experience • Better integration with translation tools • Better search • Able to handle a team of 30-ish editors Expectations
  • 10. |9| • Umbraco is not multi-lingual by default • Default translation flow is obsolete and weak • Integration of legacy satellite applications • CI and deployment • Import from old CMS Challenges
  • 12. |11| • Multi-discipline team: 2 analysts, 3 frontend devs, 6 backend devs • Scrum/sprint planning/daily standups • Atlassian stack on premise as collaboration tools for: - Analysis (Confluence) - Development (BitBucket) - CI/CD (Bamboo) - Sprint planning/issue tracking (Jira) Development/team setup
  • 14. |13| • Few options – Umbraco as a Service #UaaS – Shared database development All developers use the same database for development (doctypes/datatypes/…) – Local database development Each individual developer uses a local database, either Sql server or Sql Server CE Umbraco development setup
  • 15. |14| Decision ? • Umbraco as a Service = Nay  • Shared development = NO – Perfect for a single environment (eg. DEV) – People don’t need to sync any metadata nor content – Candidate to a cluttered database if someone forgets to delete any metadata or content that is not part of the solution • Local database development = YES – People will need to sync all metadata and “relevant” content – Perfect fit for proof of concept’ing (switching between Sql server/Sql server CE) – Perfect fit for continuous integration with multiple environments if we can find a way to synchronize metadata and content Umbraco development setup
  • 16. |15| Lightweight, can be easily fine-tuned to only sync minimal settings to get your environments in clean state for both metadata as well as content and can be automated Challenges ? - How to handle media efficiently? - Dealing with exotic datatypes - Long path names in continuous environment uSync
  • 17. |16| .Core project (Business logic) / .Core.Tests project Startup configuration (DI = Unity, IoC, event handling) PropertyValueConverters ModelsBuilder Model customizations Controllers (Route hijacking all the things) Services Automapper ViewModels Project/solution setup
  • 18. |17| .Web project Default Umbraco installation Minimal changes allowed (.config) to smooth upgrades App_Plugins for custom built and 3rd party packages (Nuget/Private Nuget) even for packages from the online repository .Frontend project All things related to UI (views/js/css) Frontend team uses their own workflow to generate assets which are copied into the .Web project .Resources project Legacy dictionary Project/solution setup
  • 19. |18| Workflow = GitFlow Main develop branch Each feature/bugfix = separate branch PR with approval = Merge Merge vs rebase! Strict rules in implementing features Features must be small Changes unrelated to feature = rejected Every feature is discussed upfront Commits / commit messages / PR messages must be very clear
  • 22. |21| Build plan kicks in for every commit on feature/bugfix branch pushed to remote repository - Build must be successful - All related tests must pass - Continuous code quality assurance (SonarQube) Build plan is only responsible for creating the required artifacts Build plan will never change anything (files/configurations) Build/Deployment pipeline (Bamboo)
  • 23. |22| Different build plans for DEV/TEST and STA/PROD DEV/TEST = 1 single artifact STA/PROD = 2 artifacts, 1x frontend and 1x backend Build/Deployment pipeline (Bamboo)
  • 25. |24| • Only if a build has been completed without errors, it becomes candidate for “release” • Release plan takes care moving the artifacts from the build to your “destination” environment • Release plan is also responsible for configuring the environment (web.config transformations, uSync) • Release can be automated (DEV) or is a manual process (TEST/STA/PROD) Build/Deployment pipeline (Bamboo)
  • 28. |27| • Back-office shielded from internet • Instant publishing of content • Performance and availability Security and publishing
  • 29. |28| Systems and caching SQL SQL UMBRACO CMS Production Environment Varnish cache servers Umbraco IIS web servers Windows File share cluster HTTP HTTP HTTP HTTP HTTP SQL HTTP SMB SMB Database cluster Internet SQL SQL Authoring/back office HTTP/HTTPS Alteon Load Balancer
  • 30. |29| • 3 level caching 1. ASP.NET and Umbraco caching 2. Varnish 3. CloudFlare (future) Caching
  • 31. |30| • Reverse Proxy • Caching based on HTTP Headers • Behavior configurable with a DSL • Possible to invalidate individual pages Varnish
  • 34. |33| • In-page editing experience • Find content easily (even with 1000’s of nodes) Main requirements
  • 37. |36| Grid / NestedContent / DocTypeGridEditor / Customized Vorto
  • 46. |45| • 1-1 translation of 24 languages • Batch management of languages • Localize just the minimum need • Export to industry standard XLIFF format • Automatic import of translation Requirements
  • 47. |46| • XML Localization Interchange File Format • The only open standard bitext format • OASIS standard since 2008 • Supported by all professional CAT tools in the market • Bitext is a file that contains both source and target languages correctly « aligned » What is XLIFF
  • 48. |47| Tyger Tyger, burning bright, Tigre! Tigre! Divampante fulgore In the forests of the night; Nelle foreste della notte, What immortal hand or eye, Quale fu l'immortale mano o l'occhio Could frame thy fearful symmetry? Ch'ebbe la forza di formare la tua agghiacciante simmetria? William Blake / Giuseppe Ungaretti What is bitext
  • 49. |48| Tyger Tyger, burning bright, Tigre! Tigre! Divampante fulgore In the forests of the night; Nelle foreste della notte, What immortal hand or eye, Quale fu l'immortale mano o l'occhio Could frame thy fearful symmetry? Ch'ebbe la forza di formare la tua agghiacciante simmetria? William Blake Giuseppe Ungaretti What is bitext
  • 50. |49| msgid "Tyger Tyger, burning bright," msgstr "Tigre! Tigre! Divampante fulgore" msgid "In the forests of the night;" msgstr "Nelle foreste della notte," msgid "What immortal hand or eye," msgstr "Quale fu l'immortale mano o l'occhio" msgid "Could frame thy fearful symmetry?" msgstr "Ch'ebbe la forza di formare la tua agghiacciante simmetria?" What is bitext
  • 51. |50| <source>Tyger Tyger, burning bright,</source> <target>Tigre! Tigre! Divampante fulgore</target> <source>In the forests of the night;</source> <target>Nelle foreste della notte,</target> <source>What immortal hand or eye,</source> <target>Quale fu l'immortale mano o l'occhio</target> <source>Could frame thy fearful symmetry?</source> <target>Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target> What is bitext
  • 52. |51| <source xml:lang="EN">Tyger Tyger, burning bright,</source> <target xml:lang="IT">Tigre! Tigre! Divampante fulgore</target> <source xml:lang="EN">In the forests of the night;</source> <target xml:lang="IT">Nelle foreste della notte,</target> <source xml:lang="EN">What immortal hand or eye,</source> <target xml:lang="IT">Quale fu l'immortale mano o l'occhio</target> <source xml:lang="EN">Could frame thy fearful symmetry?</source> <target xml:lang="IT">Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target> What is bitext
  • 53. |52| <unit id="1"> <segment> <source xml:lang="EN">Tyger Tyger, burning bright,</source> <target xml:lang="IT">Tigre! Tigre! Divampante fulgore</target> </segment> <segment> <source xml:lang="EN">In the forests of the night;</source> <target xml:lang="IT">Nelle foreste della notte,</target> </segment> <segment> <source xml:lang="EN">What immortal hand or eye,</source> <target xml:lang="IT">Quale fu l'immortale mano o l'occhio</target> </segment> <segment> <source xml:lang="EN">Could frame thy fearful symmetry?</source> <target xml:lang="IT">Ch'ebbe la forza di formare la tua agghiacciante simmetria?</target> </segment> </unit> What is bitext
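The bitext unit built up on the slides above can be generated from a list of aligned source/target pairs. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `build_unit` is hypothetical, and the output is only the `<unit>` fragment from the slide, not a complete, schema-valid XLIFF file.

```python
import xml.etree.ElementTree as ET

# The reserved XML namespace, needed to emit xml:lang attributes.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_unit(unit_id, pairs, source_lang="en", target_lang="it"):
    """Build a <unit> holding one <segment> per aligned (source, target) pair."""
    unit = ET.Element("unit", {"id": unit_id})
    for source_text, target_text in pairs:
        segment = ET.SubElement(unit, "segment")
        source = ET.SubElement(segment, "source", {f"{{{XML_NS}}}lang": source_lang})
        source.text = source_text
        target = ET.SubElement(segment, "target", {f"{{{XML_NS}}}lang": target_lang})
        target.text = target_text
    return unit


pairs = [
    ("Tyger Tyger, burning bright,", "Tigre! Tigre! Divampante fulgore"),
    ("In the forests of the night;", "Nelle foreste della notte,"),
]
unit = build_unit("1", pairs)
print(ET.tostring(unit, encoding="unicode"))
```

Keeping source and target in the same `<segment>` is what makes the file "aligned": a CAT tool never has to guess which translation belongs to which sentence.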
  • 54. |53| • Linked trees – PRO: default Umbraco approach to localization – CONS: everything else :) • Nested nodes – PRO: meaningful history, easier to manage programmatically – CONS: not possible to sync grid structure between languages, needs custom batch publishing actions (and much more) • Vorto – PRO: localize just what’s needed, just one node per content, one grid structure for all – CONS: loss of meaningful history, needs a custom publishing “flag” per language, more difficult to manage programmatically 3 options
  • 55. |54| • Vorto (customised) • Custom “vorto-like” grid editor • Custom translation component Solution
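To illustrate the trade-off behind the Vorto-style choice (one node per content, per-language values, a publishing "flag" per language), here is a small sketch. The dictionary shape and the function `localized_value` are assumptions made for illustration only; they are not Vorto's actual storage format nor the Council's customised implementation.

```python
from typing import Optional


def localized_value(prop: dict, language: str) -> Optional[str]:
    """Return the value for `language` only if that language is flagged as published."""
    if not prop.get("published", {}).get(language, False):
        return None
    return prop.get("values", {}).get(language)


# One property on one content node, carrying every language at once (hypothetical shape).
title = {
    "values": {"en": "European Council", "it": "Consiglio europeo", "fr": "Conseil européen"},
    "published": {"en": True, "it": True, "fr": False},
}

print(localized_value(title, "it"))
print(localized_value(title, "fr"))
```

Because all languages live on the same node, the node's own version history no longer says which language changed, which is the "loss of meaningful history" listed among the cons.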
  • 63. |62| Extraction Umbraco Generic document structure Initial XLIFF (with HTML markup) Split paragraphs and extract inline code Segmentation Apply Translation Memory Off to Translation Workflow (SDL Studio) Enrich with custom extensions
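The "split paragraphs and extract inline code" and "segmentation" steps can be sketched roughly as follows. This is a simplified illustration with regular expressions, not the XliffLib implementation: the numbered `{n}` placeholders stand in for XLIFF inline elements, the list of inline tags is an assumption, and real segmentation rules are far more careful than splitting on sentence-ending punctuation.

```python
import re

# Assumed set of inline HTML tags to lift out of the translatable text.
INLINE_TAG = re.compile(r"</?(?:b|i|em|strong|a|span)\b[^>]*>")
PARAGRAPH = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL)
# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_units(html):
    """Turn each <p> into one unit: a list of segments plus the inline codes removed from it."""
    units = []
    for body in PARAGRAPH.findall(html):
        codes = []

        def to_placeholder(match):
            # Remember the original tag and leave a numbered placeholder in the text.
            codes.append(match.group(0))
            return "{%d}" % len(codes)

        text = INLINE_TAG.sub(to_placeholder, body.strip())
        segments = [s for s in SENTENCE_END.split(text) if s]
        units.append({"segments": segments, "codes": codes})
    return units


html = (
    "<p>The Council met <b>today</b>. Ministers agreed.</p>"
    '<p>Next meeting: <a href="/en/meetings/">calendar</a></p>'
)
units = extract_units(html)
for unit in units:
    print(unit)
```

Replacing markup with placeholders keeps translators from accidentally breaking HTML, and the stored codes allow the markup to be restored when the translation is merged back.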
  • 64. |63| • Complete the system :) • Make the generic extraction/merging library open source • Integrate the Umbraco-specific extraction/merging into Umbraco Core https://github.com/simonech/XliffLib Future steps
  • 66. |65| List of internal/external systems to interact with • PoolParty to enrich your content with valuable metadata (taxonomies) • Oracle database (MPO Meetings/Meeting planner) • (TV)Newsroom – Video API – Asset/image library • RSS feeds/Twitter feeds
  • 69. |68| PoolParty Why? - Exchange taxonomy between different units within the EU Council, or even more… with the outside world and vice versa Example: A “Location” taxonomy may already exist “somewhere”, so we should be able to transparently reference this taxonomy without the need to create a new one Europe > Belgium > Brussels capital region > Brussels > …
  • 70. |69| PoolParty Automated tagging? • Automated content tagging is possible using a 3rd party solution, “Powertagging with Umbraco and PoolParty” • It didn’t really fit our requirements (legacy data, “taxonomy” currently not semantically normalized) Solution? On-demand synchronization from PoolParty to Umbraco • Limited number of syncs (~1/month) • One-way sync from PoolParty -> Umbraco • Don’t rely on server availability
  • 73. |72| PoolParty Challenges ? • Enrich our sync’ed data with custom “attributes” Examples: • Set country flag for specific “location” taxonomies • Change default descriptions of a taxonomy on the frontend website – “Council of the European Union” -> “Council of the EU” Solution ? • Create a “developer centric” “taxonomy settings section” to create a link between the sync’ed taxonomy and our custom metadata
  • 75. |74| Meeting planner data (Oracle db) External tools used by other departments creating “Meeting”s Challenges: • Data stored in external database • Data is only exposed through read-only views on the Oracle db • ~3000 meetings currently in the system and available online, with ~100 meetings created monthly • Approx. 4 meetings/month need additional content editing before publishing • Link with the existing sync’ed taxonomy • Advanced search (date/taxonomy/…) Do we import this data into Umbraco?
  • 77. |76| Meeting planner data (Oracle db) Decisions: • Don’t import any meeting data into Umbraco (you’ve got everything you need already) • Remove the connection to the Oracle db • Push data from the Oracle db view to a custom SQL table • Enrich meeting data at import and store it alongside the meeting data in the custom SQL table • Optimize the custom SQL table for max performance (index/…) • Meetings created in Umbraco must reference a SQL record Result: • Searching/querying the db is still very fast (SQL and storage are optimized for it) • Content editors can still use Umbraco to add more content • Don’t bloat the Umbraco system with nodes that don’t add any value
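The decision to push meeting rows into an indexed custom table, rather than creating an Umbraco node per meeting, can be sketched as below. SQLite (from the Python standard library) stands in for the real SQL database, and the table, column and function names are all hypothetical; the sample rows are made-up placeholders, not real meeting data.

```python
import sqlite3


def create_store(conn):
    """Create the custom meeting table with indexes on the columns used for searching."""
    conn.execute(
        """CREATE TABLE meeting (
               meeting_id  INTEGER PRIMARY KEY,
               title       TEXT NOT NULL,
               start_date  TEXT NOT NULL,  -- ISO date, so string order equals date order
               taxonomy_id TEXT            -- link to the sync'ed taxonomy
           )"""
    )
    conn.execute("CREATE INDEX ix_meeting_date ON meeting (start_date)")
    conn.execute("CREATE INDEX ix_meeting_taxonomy ON meeting (taxonomy_id)")


def push_meetings(conn, rows):
    """Repeatable import: rows with an existing meeting_id are replaced, not duplicated."""
    conn.executemany(
        "INSERT OR REPLACE INTO meeting (meeting_id, title, start_date, taxonomy_id) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def meetings_between(conn, first_day, last_day):
    """Date-range search served entirely by the custom table."""
    cursor = conn.execute(
        "SELECT meeting_id, title FROM meeting "
        "WHERE start_date BETWEEN ? AND ? ORDER BY start_date",
        (first_day, last_day),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_store(conn)
sample_rows = [
    (1, "Sample meeting A", "2017-06-12", "taxonomy-a"),
    (2, "Sample meeting B", "2017-06-22", "taxonomy-b"),
]
push_meetings(conn, sample_rows)
push_meetings(conn, sample_rows)  # running the import twice must not duplicate rows
print(meetings_between(conn, "2017-06-01", "2017-06-15"))
```

`INSERT OR REPLACE` is SQLite syntax; another database engine would need its own upsert statement. A meeting that needs extra editorial content would then be an Umbraco node storing only the `meeting_id` of its record.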
  • 78. |77| External asset library (TV)Newsroom • Most assets are referenced from an internal asset library shared across multiple teams/units/... • Some assets are stored externally (Rackcdn.com) • Still use the media section for all other assets though Challenges? - Images are huge: very high resolution, >10 MB - Videos are stored externally; only a public API is available to fetch the content (and thumbnail previews) (Challenge?)
  • 79. |78| External asset library (TV)Newsroom Solution implemented • ImageProcessor takes care of retrieving/storing/caching images from multiple sources, over both HTTP and HTTPS - Requires an .axd service with both HTTP and HTTPS endpoints - Proxy configuration is still a bit flaky (PR?) • Offload API requests that fetch info from the external source to an internal server, which returns the results - Fine-tune network/security
  • 81. |80| • Support of full-text search in 24 languages • Boosting of particular elements of the pages • Indexing of “composition” pages • Indexing of external sources (PDFs, external site) • Fast availability of new/updated pages in the index Requirements
  • 82. |81| Elastic Search [Architecture diagram. Components: Frontend, Search API, Elastic Search backend, Apache Manifold crawler. Flows shown: search, crawling, notification.]
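The requirement to boost particular elements of a page maps naturally onto field boosts in an Elasticsearch query. The sketch below only builds the JSON query body; the field names (`title`, `summary`, `body`, `language`) and the boost factors are assumptions, since the deck does not describe the index mapping.

```python
import json


def build_search_query(text, language):
    """Build an Elasticsearch query body that weights some fields above others."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": text,
                        # "field^N" multiplies the score contribution of that field by N.
                        "fields": ["title^3", "summary^2", "body"],
                    }
                },
                # Restrict results to one of the 24 languages (assumed keyword field).
                "filter": {"term": {"language": language}},
            }
        }
    }


query = build_search_query("fisheries quotas", "en")
print(json.dumps(query, indent=2))
```

The resulting dictionary would be sent as the body of a search request by whatever client the Search API uses.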
  • 83. |82| • Just like Google :) • Structured information passed with: – HTTP headers • etag: "078de59b16c27119c670e63fa53e5b51" – Microdata: <time itemprop="startDate" datetime="2017-06-08T14:45">June 8, 2:45pm</time> – RDFa: <div profile="http://data.consilium.europa.eu/data/public_voting/rdf/schema/Configuration" typeof="Article"> <span property="http://data.consilium.europa.eu/data/public_voting/consilium/configuration/agri">Agriculture and Fisheries</span> </div> Crawling
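As an illustration of how a crawler can pick structured information out of a page, the sketch below reads the Microdata `startDate` from the `<time>` element shown on the slide, using Python's standard `html.parser`. The class name `ItempropCollector` is hypothetical, and this is not how Apache Manifold is actually configured; it only shows the principle.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect itemprop values carried by <time datetime="..."> elements."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        # For <time>, the machine-readable value lives in the datetime attribute.
        if name and tag == "time" and "datetime" in attributes:
            self.properties[name] = attributes["datetime"]


collector = ItempropCollector()
collector.feed('<time itemprop="startDate" datetime="2017-06-08T14:45">June 8, 2:45pm</time>')
print(collector.properties)
```

The indexed document then gets a sortable date field instead of the free-text "June 8, 2:45pm".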
  • 84. |83| Import from legacy cms (E-project)
  • 86. |85| • Non-structured = custom legacy XML format • Storage – Content: SQL Server – Assets (images/PDFs): on disk • Other requirements • The process of importing content/assets has to be repeatable in a CI/CD environment • Iterative development: start small, grow fast Migrate “non-structured” content from Ektron into Umbraco
  • 87. |86| Looking at two “migration” tools - Cms import (@rsoeteman‘s well known package) - Chauffeur (~Umbraco CLI tool started by @slace) Migrate “non-structured” content from Ektron into Umbraco
  • 88. |87| Introducing Chauffeur “Chauffeur is a CLI for Umbraco, it will sit with your Umbraco websites bin folder and give you an interface to which you can execute commands, known as Deliverables, against your installed Umbraco instance.” • Command line: perfect fit for our continuous integration/deployment scenario • Lightweight: can be easily added or removed from your environments – Drop the assembly in the /bin folder and you’re set, remove it in production – Ability to inject any Umbraco service API – Code once, run anywhere (building blocks of reusable deliverables) – Create a chain of deliverables to run from (a .delivery file) • Restrictions - Publishing content won’t work! Migrate “non-structured” content from Ektron into Umbraco
  • 90. |89| Migrate “non-structured” content from Ektron into Umbraco For each content to be migrated • Get record data out of the legacy Sql server database • Create new content using Umbraco service API • Property data transformation using custom object model and Json.net to serialize to a “json string” • Set property data on the new content • Save new content in cms Challenges - Grid content (rte content) - Customized Vorto implementation - NestedContent / DocTypeGridEditor / Vorto and any possible combinations
  • 91. |90| Migrate “non-structured” content from Ektron into Umbraco Deliverable transforms xml into json blob using our custom data object model and Json.net (simplified example)
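The example from the original slide is not part of this transcript. As a stand-in, the sketch below shows the general idea of the transformation: parse a legacy XML record, map it onto a small object model, and serialize that model to a JSON string to be set as a property value. It is written in Python rather than as a Chauffeur deliverable, and both the legacy XML layout and the JSON shape are invented for illustration; they are not the Ektron format nor Umbraco's grid schema.

```python
import json
import xml.etree.ElementTree as ET


def legacy_xml_to_json(legacy_xml):
    """Map a (hypothetical) legacy XML record to a JSON string for a grid-like property."""
    root = ET.fromstring(legacy_xml)
    model = {
        "title": root.findtext("title", default=""),
        # One rich-text row per legacy paragraph.
        "rows": [{"editor": "rte", "value": p.text or ""} for p in root.findall("body/p")],
    }
    return json.dumps(model, ensure_ascii=False)


legacy_record = """
<content>
  <title>Sample page</title>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</content>
"""
json_blob = legacy_xml_to_json(legacy_record)
print(json_blob)
```

Keeping the mapping in one pure function makes the import repeatable: the same legacy record always produces the same JSON, which suits a CI/CD run.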
  • 92. |91| Chauffeur references - https://our.umbraco.org/projects/collaboration/chauffeur/ - https://github.com/aaronpowell/chauffeur - https://24days.in/umbraco-cms/2015/may-the-tools-be-with-you/ Migrate “non-structured” content from Ektron into Umbraco
  • 94. |93| • First try to use what’s available out of the box or on Our • If that is not enough, Umbraco can be heavily extended • Umbraco can be used in “security conscious” entities Conclusion

Editor's Notes

  • #63: The CMS is structured in tabs and properties (and sub-properties). Tabs map to Groups. Properties map to Units.
  • #64: From Umbraco we export a generic document structure, which is extracted into a crude XLIFF doc (with many paragraphs and with HTML markup). This is then processed again: paragraphs are split and HTML is converted into inline elements. Then each unit is segmented. Translation memory is applied to send the bitext format. Finally, custom extensions are applied before sending to the translation workflow and to SDL Studio.
  • #65, #67 to #80, #86 to #91, #93: Demo will be a pre-recorded video