Internet Development
Experiences and Lessons
Philip Smith
BDNOG 1
23rd May 2014
Dhaka
Background
n  Internet involvement started in 1989 while at
University completing PhD in Physics
n  Got a little bit side-tracked by Unix, TCP/IP and
ethernet
n  Helped design and roll out new TCP/IP ethernet
network for Department
n  Involved in day to day operations of CAD Lab as
well as Dept public Unix servers (HP and Sun)
n  Caught the Internet bug!
How it all started
n  At end of University Post Doc in 1992
n  Job choice was lecturer or “commercial world”
n  Chose latter – job at UK’s first ISP advertised on
Usenet News uk.jobs feed
n  Applied, was successful, started at PIPEX in 1993
n  First big task – upgrade modems from standalone
9.6kbps to brand new Miracom 14.4kbps rack
mount
n  With upgradable FLASH for future standards upgrades!
In at the deep end
n  Testing testing and more
testing
n  Rackmount saved space
n  But did V.32bis work with
all customers??
First lesson
n  Apart from wishing to be back at Uni!
n  Test against customers expectations and
equipment too
n  Early v.32bis (14.4kbps) modems weren’t always
backward compatible with v.32 (9.6kbps) or older
standards
n  One manufacturer’s v.32bis didn’t always talk to
another’s v.32bis – fall back to v.32 or slower
n  Vendor’s promises and specification sheets
often didn’t completely match reality
ISP Backbones
n  In those early days, BGP was “only for experts”, so I
watched in awe
n  Learned a little about IGRP and BGPv3
n  But not enough to be conversant
n  April 1994 saw the migration from Classful to
Classless BGP
n  Beta Cisco IOS had BGPv4 in it
n  Which meant that our peering with UUNET could be
converted from BGPv3 to BGPv4
n  With the cheerful warning that “this could break the
Internet”
ISP Backbones
n  Internet didn’t break, and the whole Internet had
migrated to using classless routing by end of 1994
n  But classful days had left a mess behind
n  Large numbers of “Class Cs” still being announced
n  The CIDR Report was born to try and encourage these Class
Cs to be aggregated
n  Cisco made lots of money upgrading existing AGS and AGS+
routers from 4Mbytes to 16Mbytes of RAM to accommodate
n  ISP engineers gained lots of scars on
hands from replacing memory boards
and interfaces
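The aggregation the CIDR Report was encouraging can be sketched in a few lines. This is only an illustration: the four prefixes are private-space stand-ins chosen for the example, not routes from the period, and it uses Python's standard `ipaddress` module.

```python
import ipaddress

# Illustrative prefixes only: four contiguous /24s ("Class Cs") from private space.
class_cs = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregates = list(ipaddress.collapse_addresses(class_cs))

print(f"{len(class_cs)} announcements -> {len(aggregates)}: {aggregates[0]}")
```

Four table entries become one /22, which is the kind of saving that mattered when the whole table had to fit in 16Mbytes.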
BGP improvements
n  The ISP in 2014 has never had it so good!
n  In 1994/5:
n  iBGP was fully meshed
n  Routers had 16Mbytes RAM
n  Customer BGP announcements only changeable during
maintenance outages
n  BGP table took most of the available RAM in a router
n  The importance of separation of IGP/iBGP/eBGP was still not
fully appreciated
n  No such thing as a BGP community or other labour saving
configuration features
BGP improvements
n  Major US ISP backbone meltdown
n  iBGP full mesh overloaded CPUs, couldn’t be
maintained
n  Cisco introduced BGP Confederations, and a little
later Route Reflectors, into IOS
n  By this point I was running our backbone
operations
n  Colleague and I migrated from full mesh to per-
PoP Route Reflector setup in one 2 hour
maintenance window
Second Lesson
n  Migrating an entire backbone of 8 PoPs and
50+ routers from one design of routing
protocol to another design should not be
done without planning, testing, or phasing
n  We were lucky it all “just worked”!
Peering with the “enemy”
n  Early PIPEX days saw us have our own paid capacity
to the US
n  With a couple of paid connections to Ebone (for their
“Europe” routes) and SWIPnet (as backup)
n  Paid = V Expensive
n  Interconnecting with UK competition (UKnet, Demon,
BTnet) seen as selling the family jewels! And would
be extremely bad for sales growth
n  Even though RTT, QoS, customer complaints, extreme cost
of international bandwidth, logic and commonsense said
otherwise
n  But we did connect to JANET (UK academics) – because
they were non-commercial and “nice guys”
Birth of LINX
n  Thankfully logic, commonsense, RTT, QoS and finances
prevailed over the sales fear campaign
n  The technical leadership of PIPEX, UKnet, Demon, BTnet and
JANET met and agreed an IXP was needed
n  Sweden had already got Europe’s first IX, the SE-GIX, and that
worked v nicely
n  Of course, each ISP wanted to host the IX as they had “the best
facilities”
n  Luckily agreement was made for an
independent neutral location – Telehouse
n  Telehouse was a Financial disaster-recovery
centre – they took some serious persuading
that this Internet thing was worth selling some
rack space to
Success: UK peering
n  LINX was established
n  Telehouse London
n  5 UK network operators (4 commercial, 1 academic)
n  BTnet was a bit later to the party than the others
n  First “fabric” was a redundant PIPEX 5-port ethernet hub!
n  We had just deployed our first Catalyst 1201 in our PoPs
n  Soon replaced with a Catalyst 1201 8-port 10Mbps ethernet
switch when the aggregate traffic got over about 3Mbps
n  Joined by a second one when redundancy and more capacity
was needed
Third Lesson
n  Peering is vital to the success of the Internet
n  PIPEX sales took off
n  Customer complaints about RTT and QoS disappeared
n  Our traffic across LINX was comparable to our US traffic
n  The LINX was critical in creating the UK Internet
economy
n  Microsoft European Datacentre was UK based (launched in
1995), connecting via PIPEX and BTnet to LINX
n  Our resellers became ISPs (peering at LINX, buying their
own international transit)
n  More connections: smaller ISPs, international operators,
content providers (eg BBC)
IGPs
n  IGRP was Cisco’s classful interior gateway protocol
n  Migration to EIGRP (the classless version) happened
many months after the Internet moved to BGPv4
n  Backbone point to point links were all /26s, and only visible
inside the backbone, so the classfulness didn’t matter
n  EIGRP was Cisco proprietary, and with the increasing
availability of other router platforms for access and
aggregation services, decision taken to migrate to
OSPF
n  Migration in itself was easy: EIGRP distance was 90, OSPF
distance was 110, so deployment of OSPF could be done “at
leisure”
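Why the distances made the migration easy can be shown with a toy route selection. This is a minimal sketch: the prefix and next hops are hypothetical, and the selection is reduced to "lowest administrative distance wins", which is the only part that matters here.

```python
# Default administrative distances quoted in the slide.
ADMIN_DISTANCE = {"eigrp": 90, "ospf": 110}


def installed_route(candidates):
    """Pick the route a router would install: lowest administrative distance wins."""
    return min(candidates, key=lambda r: ADMIN_DISTANCE[r["protocol"]])


# Hypothetical prefix and next hops, learned from both IGPs during the migration.
during = [
    {"protocol": "eigrp", "prefix": "10.0.0.0/24", "next_hop": "10.1.1.1"},
    {"protocol": "ospf", "prefix": "10.0.0.0/24", "next_hop": "10.1.1.2"},
]
# Once EIGRP is switched off, only the OSPF route remains.
after = [r for r in during if r["protocol"] != "eigrp"]

print(installed_route(during)["protocol"])  # eigrp
print(installed_route(after)["protocol"])   # ospf
```

While both protocols run, EIGRP keeps forwarding and OSPF sits unused in the background; OSPF only takes over when EIGRP is removed.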
Fourth Lesson
n  IGP migration needs to be done for a reason
n  With a documented migration and back out plan
n  With caution
n  The reasons need to be valid
n  EIGRP to OSPF in the mid 90s took us from
working scalable IGP to IOS bug central L –
Cisco’s OSPF rewrite was still half a decade away
n  UUNET was by then our parent, with a strong ISIS
heritage and recommendation
n  Cisco made sure ISIS worked, as UUNET and Sprint
needed it to do so
Network Redundancy
n  A single link of course means a single point of failure
– no redundancy
n  PIPEX had two links from UK to US
n  Cambridge to Washington
n  London to New York
n  On separate undersea cables
n  Or so BT and C&W told us
n  And therein is a long story about guarantees,
maintenance, undersea volcanoes, cable breaks, and
so on
Fifth Lesson
n  Make sure that critical international fibre
paths:
n  Are fully redundant
n  Do not cross or touch anywhere end-to-end
n  Go on the major cable systems the supplier claims
they go on
n  Are restored after maintenance
n  Have suitable geographical diversity (running in
the same duct is not diversity)
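One way to keep checking this is to compare the segment list the supplier reports for each circuit. The sketch below is only an illustration: the segment names are invented, and it assumes the supplier will actually tell you which ducts and cable systems each circuit uses.

```python
def shared_segments(path_a, path_b):
    """Return the segments (ducts, cable systems, landing stations) two paths share."""
    return sorted(set(path_a) & set(path_b))


# Hypothetical segment names, for illustration only.
circuit_a = ["duct-north", "cable-system-1", "landing-station-east"]
circuit_b_as_sold = ["duct-south", "cable-system-2", "landing-station-west"]
# Same circuit after a maintenance reroute that was never reverted.
circuit_b_after_maintenance = ["duct-south", "cable-system-1", "landing-station-west"]

print(shared_segments(circuit_a, circuit_b_as_sold))            # []
print(shared_segments(circuit_a, circuit_b_after_maintenance))  # ['cable-system-1']
```

An empty result at contract time proves nothing later on, so the check has to be repeated after every maintenance window.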
Aggregate origination
n  Aggregate needs to be generated within ISP
backbone for reachability
n  Leak subprefixes only for traffic engineering
n  “Within backbone” does not mean overseas PoP or at the
peering edge of the network
n  Remember those transatlantic cables
n  Which were redundant, going to different cities, different
PoPs, diverse paths,…
n  Having the Washington border routers originate our
aggregates wasn’t clever
Aggregate origination
n  Both transatlantic cables failed
n  Because one had been rerouted during maintenance – and
not put back
n  So both our US circuits were on the same fibre – which
broke
n  We didn’t know this – we thought the Atlantic ocean had
had a major event!
n  Our backup worked – for outbound traffic
n  But nothing came back – the best path as far as the US
Internet was concerned was via MAE-East and our UUNET
peering to our US border routers
n  Only quick solution – switch the routers off, as
remote access wasn’t possible either
Sixth lesson
n  Only originate aggregates in the core of
the network
n  We did that, on most of the backbone core
routers, to be super safe
n  But never on the border routers!!
How reliable is redundant?
n  Telehouse London was mentioned earlier
n  Following their very great reluctance to accept our PoP, and
the LINX, other ISPs started setting up PoPs in their facility
too
n  After 2-3 years, Telehouse housed most of the UK’s ISP
industry
n  The building was impressive:
n  Fibre access at opposite corners
n  Blast proof windows and a moat
n  Several levels of access security
n  3 weeks of independent diesel power, as well as external
power from two different power station grids
How reliable is redundant?
n  Technically perfect, but humans had to run it
n  One day: Maintenance of the diesel generators
n  Switch them out of the protect circuit (don’t want a power
cut to cause them to start when they were being serviced)
n  Maintenance completed – they are switched back into the
protect circuit
n  Only the operator switched off the external mains instead
n  Didn’t realise the mistake until the UPSes had run out of power
n  Switched external power back on – the resulting power surge
overloaded UPSes and power supplies of many network devices
n  News headlines: UK Internet “switched off” by
maintenance error at Telehouse
How reliable is redundant?
n  It didn’t affect us too badly:
n  Once BT and Mercury/C&W infrastructure returned we got
our customer and external links back
n  We were fortunate that our bigger routers had dual supplies,
one connected to UPS, the other to unprotected mains
n  So even though the in-room UPS had failed, when the external
mains power came back, our routers came back – and survived
the power surge
n  Other ISPs were not so lucky
n  And we had to restrain our sales folks from being too smug
n  But our MD did interview on television to point out the
merits of solid and redundant network design
Seventh lesson
n  Never believe that a totally redundant
infrastructure is that
n  Assume that each component in a network
will fail, no matter how perfect or reliable it
is claimed to be
n  Two of everything!
Bandwidth hijack
n  While we are talking about Telehouse
n  And LINX…
n  Early LINX membership rules were very restrictive
n  Had to pay £10k membership fee
n  Had to have own (proven) capacity to the US
n  Was designed to keep smaller ISPs and resellers out of the
LINX – ahem!
n  Rules eventually removed once the regulator started asking
questions – just as well!
n  But ISPs still joined, many of them our former
resellers, as well as some startups
Bandwidth hijack
n  We got a bit suspicious when one new ISP claimed
they had T3 capacity to the US a few days after we
had launched our brand new T3
n  Cisco’s Netflow quickly became our friend
n  Had just been deployed on our border routers at LINX and in
the US
n  Playing with early beta software again on critical infrastructure J
n  Stats showed outbound traffic from a customer of ours also
present at LINX (we didn’t peer with customers) was
transiting our network via LINX to the US
n  Stats showed that traffic from an AS we didn’t peer with at
MAE-East was transiting our network to this customer
n  What was going on??
Bandwidth hijack
n  What happened?
n  LINX border routers were carrying the full BGP table
n  The small ISP had pointed default route to our LINX router
n  They had another router in the US, at MAE-East, in their US
AS – and noticed that our MAE-East peering router also had
transit from UUNET
n  So pointed a default route to us across MAE-East
n  The simple fix?
n  Remove the full BGP table and default routes from our LINX
peering routers
n  Not announcing prefixes learned from peers to our border
routers
Eighth lesson
n  Peering routers are for peering
n  And should only carry the routes you wish peers
to see and be able to use
n  Border routers are for transit
n  And should only carry routes you wish your transit
providers to be able to use
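The hijack and its fix come down to what a peering router can look up. The following is a minimal sketch with made-up prefixes and next-hop names, using a simple longest-prefix match over a dictionary in place of a real forwarding table.

```python
import ipaddress


def lookup(table, destination):
    """Longest-prefix match; returns the next hop, or None if no route covers it."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical prefixes: one of our customers, and a destination reached via US transit.
customer = ipaddress.ip_network("192.0.2.0/24")
us_destination = ipaddress.ip_network("198.51.100.0/24")

# Before the fix: the peering router carries everything, transit routes included.
peering_router_full_table = {customer: "customer-pop", us_destination: "us-transit"}
# After the fix: only the routes we want peers to be able to use.
peering_router_fixed = {customer: "customer-pop"}

packet_from_freeloader = "198.51.100.10"
print(lookup(peering_router_full_table, packet_from_freeloader))  # us-transit
print(lookup(peering_router_fixed, packet_from_freeloader))       # None
```

With the full table present, anyone pointing a default route at the router gets carried to the US; with only customer routes, the same packet has nowhere to go and is dropped.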
The short sharp shock
n  It may have only been 5 years from 1993 to 1997
n  But the Internet adoption grew at a phenomenal rate
in those few years
n  In the early 90s it was best effort, and end users
were still very attached to private leased lines, X.25,
etc
n  By the late 90s the Internet had became big business
n  Exponential growth in learning and experiences
n  There were more than 8 lessons!
n  (Of course, this was limited to North America and
Western Europe)
Moving onwards
n  With UUNET’s global business assuming control of
and providing technical direction to all regional and
country subsidiaries, it was time to move on
n  In 1998, next stop Cisco:
n  The opportunity to “provide clue” internally on how ISPs
design, build and operate their networks
n  Provide guidance on the key ingredients they need for their
infrastructure, and IOS software features
n  All done within the company’s Consulting Engineering
function
n  The role very quickly became one of infrastructure
development
Internet development
n  Even though it was only over 5 years, I had
accumulated in-depth skillset in most aspects of ISP
design, set up, and operational best practices
n  The 90s were the formative years of the Internet and the
technologies underlying it
n  Best practices gained from experiences then form the basis
for what we have today
n  Account teams and Cisco country operations very
quickly involved me in educating Cisco ISP
customers, new and current
n  Working with a colleague, the Cisco ISP/IXP
Workshops were born
Internet development
n  Workshops:
n  Teaching IGP and BGP design and
best practices, as well as new features
n  Covered ISP network design
n  Introduced the IXP concept, and encouraged the formation
of IXes
n  Introduced latest infrastructure security BCPs
n  Early introduction to IPv6
n  Out of the workshops grew
requests for infrastructure
development support from all
around the world
Development opportunities
n  Bringing the Internet to Bhutan
n  Joining AfNOG instructor team to teach BGP and
scalable network design
n  Introducing IXPs to several countries around Asia
n  Improving the design, operation and scalability of
service provider networks all over Asia, Africa, Middle
East and the Pacific
n  Helping establishing network operations groups
(NOGs) – SANOG, PacNOG, MENOG etc
n  Growing APRICOT as the Asia Pacific region’s premier
Internet Operations Summit
NOG Development
n  Started getting more involved in helping with
gatherings of local and regional operations
community
n  APRICOT was the first experience – difficulties of
APRICOT ‘98 and ‘99 led to a refresh of the
leadership in time for APRICOT 2001
n  APRICOT growing from strength to strength – but
annual conference had 56 economies across
AsiaPac to visit!
n  Regional and Local NOGs were the only way to
scale
NOG Development
n  NZNOG and JANOG were starting
n  SANOG launched in January 2003, hosted
alongside Nepalese IT event
n  Several international “NOG experts” participated
n  Purpose (from www.sanog.org):
n  And this is a common theme for most NOGs founded
since
SANOG was started to bring together operators for educational as well as co-
operation. SANOG provides a regional forum to discuss operational issues and
technologies of interest to data operators in the South Asian Region.
Ingredients for a successful NOG
①  Reach out to community and organise a
meeting of interested participants
②  Reach out to colleagues in community and
further afield and ask them to come talk
about interesting operational things
③  Figure out seed funding and find a place to
meet
④  Commit to a 2nd NOG meeting
⑤  Have fun!
Ingredients for a successful NOG
n  Avoid:
n  Setting up lots of committees before the NOG
meets for the first time
n  Worrying about what fees to charge or discounts
to provide
n  Worrying about making a profit
n  Hiring expensive venues, event organisers
n  Providing expensive giveaways
n  Providing speaking opportunities to product
marketeers
Ingredients for a successful NOG
n  During that first meeting:
n  Solicit suggestions about the next meeting
n  Location, content, activities
n  Suggest a mailing list
n  And then set it up, encouraging participants to join
n  Encourage organisations participating to consider
future sponsorship
n  Encourage colleagues to help with various tasks
n  Organise a meeting of the folks who helped pull
the first meeting together
n  Here is the first committee, the Coordination Team
Ingredients for a successful NOG
n  After the first meeting:
n  Plan that 2nd meeting, relaxation is not allowed
n  Don’t expect lots of people to rush and help
n  NOG leadership is about being decisive and assertive
n  And can often be lonely
n  Organise the next meeting of the Coordination
Team (face to face, teleconference,…)
n  Don’t lose momentum
n  Keep the Coordination Team involved
Ingredients for a successful NOG
n  Going forwards:
n  Encourage discussion and Q&A on the mailing list
n  No question is too silly
n  Run the second meeting, plan the third
n  But don’t try and do too many per year – one or two are
usually enough
n  Don’t rely on the international community for everything
– encourage and prioritise local participation
n  Start thinking about breaking even
n  After the 2nd or 3rd meeting, assistance with
programme development – the next committee!
The final lesson?
n  Setting up a NOG takes effort and persistence
n  Bring the community along with you
n  People attend, and return, if the experience is
positive and the content is worth coming for
n  Include all sectors and regions the NOG claims to
cover
n  Budget needs to be neutral, sponsorship
generous, participant costs low
n  No bureaucracy!
The story goes on…
n  IXP experiences
n  Nepal, Bangladesh, Singapore, Vanuatu,
India, Pakistan, Uganda, PNG, Fiji, Samoa,
Thailand, Mongolia, Philippines,…
The story goes on…
n  Other ISP design and redesigns
The story goes on…
n  Satellites
n  falling out of sky
n  latency/tcp window vs performance
The story goes on…
n  Fibre optics being stolen
n  Folks thinking it is copper
The story goes on…
n  The North Sea fogs and snow which
block microwave transmission
The story goes on…
n  “You don’t understand, Philip”
n  From ISPs, regulators, business leaders,
who think their environment is unique in
the world
The story goes on…
n  “Ye cannae change the laws o’ physics!”
n  To operators and end users who complain
about RTTs
n  Montgomery “Scotty” Scott: Star Trek
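The physics in question is easy to estimate. The sketch below assumes light travels at roughly 200,000 km/s in fibre (about two thirds of its speed in a vacuum) and uses an illustrative 8,000 km path; real cable routes are longer than the straight-line distance, so real RTTs are higher.

```python
SPEED_IN_FIBRE_KM_S = 200_000  # approximate; about two thirds of c in a vacuum


def min_rtt_ms(path_km: float) -> float:
    """Lowest possible round-trip time over a fibre path of the given length."""
    return 2 * path_km * 1000 / SPEED_IN_FIBRE_KM_S


# Illustrative path length only, not a measured cable route.
print(min_rtt_ms(8000))  # 80.0
```

No router upgrade or extra bandwidth gets below that floor; only a shorter path, or content hosted closer to the user, does.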

Internet Development Experiences and Lessons

  • 1. Internet Development Experiences and Lessons Philip Smith BDNOG 1 23rd May 2014 Dhaka
  • 2. Background n  Internet involvement started in 1989 while at University completing PhD in Physics n  Got a little bit side-tracked by Unix, TCP/IP and ethernet n  Helped design and roll out new TCP/IP ethernet network for Department n  Involved in day to day operations of CAD Lab as well as Dept public Unix servers (HP and Sun) n  Caught the Internet bug!
  • 3. How it all started n  At end of University Post Doc in 1992 n  Job choice was lecturer or “commercial world” n  Chose latter – job at UK’s first ISP advertised on Usenet News uk.jobs feed n  Applied, was successful, started at PIPEX in 1993 n  First big task – upgrade modems from standalone 9.6kbps to brand new Miracom 14.4kbps rack mount n  With upgradable FLASH for future standards upgrades!
  • 4. In at the deep end n  Testing testing and more testing n  Rackmount saved space n  But did V.32bis work with all customers??
  • 5. First lesson n  Apart from wishing to be back at Uni! n  Test against customers expectations and equipment too n  Early v.32bis (14.4kbps) modems weren’t always backward compatible with v.32 (9.6kbps) or older standards n  One manufacturer’s v.32bis didn’t always talk to another’s v.32bis – fall back to v.32 or slower n  Vendor’s promises and specification sheets often didn’t completely match reality
  • 6. ISP Backbones n  In those early days, BGP was “only for experts”, so I watched in awe n  Learned a little about IGRP and BGPv3 n  But not enough to be conversant n  April 1994 saw the migration from Classful to Classless BGP n  Beta Cisco IOS had BGPv4 in it n  Which meant that our peering with UUNET could be converted from BGPv3 to BGPv4 n  With the cheerful warning that “this could break the Internet”
  • 7. ISP Backbones n  Internet didn’t break, and the whole Internet had migrated to using classless routing by end of 1994 n  But classful days had left a mess behind n  Large numbers of “Class Cs” still being announced n  The CIDR Report was born to try and encourage these Class Cs to be aggregated n  Cisco made lots of money upgrading existing AGS and AGS+ routers from 4Mbytes to 16Mbytes of RAM to accommodate n  ISP engineers gained lots of scars on hands from replacing memory boards and interfaces
  • 8. BGP improvements n  The ISP in 2014 has never had it so good! n  In 1994/5: n  iBGP was fully meshed n  Routers had 16Mbytes RAM n  Customer BGP announcements only changeable during maintenance outages n  BGP table took most of the available RAM in a router n  The importance of separation of IGP/iBGP/eBGP was still not fully appreciated n  No such thing as a BGP community or other labour saving configuration features
  • 9. BGP improvements n  Major US ISP backbone meltdown n  iBGP full mesh overloaded CPUs, couldn’t be maintained n  Cisco introduced BGP Confederations, and a little later Route Reflectors, into IOS n  By this point I was running our backbone operations n  Colleague and I migrated from full mesh to per-PoP Route Reflector setup in one 2-hour maintenance window
  • 10. Second Lesson n  Migrating an entire backbone of 8 PoPs and 50+ routers from one design of routing protocol to another design should not be done without planning, testing, or phasing n  We were lucky it all “just worked”!
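A rough sketch of why the full mesh stopped scaling: the number of iBGP sessions grows quadratically with the router count, while a per-PoP route reflector design grows roughly linearly. The topology below is an assumption made for illustration (two reflectors per PoP, reflectors fully meshed with each other, every client peering only with the reflectors in its own PoP); the talk does not describe the actual layout.

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2

def route_reflector_sessions(total_routers, pops, rrs_per_pop):
    """Sessions under the assumed design: reflectors are fully meshed, and each
    remaining router (a client) peers with the reflectors in its own PoP."""
    reflectors = pops * rrs_per_pop
    clients = total_routers - reflectors
    if clients < 0:
        raise ValueError("more reflectors than routers")
    return full_mesh_sessions(reflectors) + clients * rrs_per_pop

if __name__ == "__main__":
    # 50 routers across 8 PoPs (figures from the talk); 2 reflectors per PoP is assumed
    print(full_mesh_sessions(50))              # 1225
    print(route_reflector_sessions(50, 8, 2))  # 188
```

Under these assumptions the backbone drops from 1225 sessions to 188, and adding a router to a PoP adds two sessions rather than one per existing router.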
  • 11. Peering with the “enemy” n  Early PIPEX days saw us have our own paid capacity to the US n  With a couple of paid connections to Ebone (for their “Europe” routes) and SWIPnet (as backup) n  Paid = V Expensive n  Interconnecting with UK competition (UKnet, Demon, BTnet) seen as selling the family jewels! And would be extremely bad for sales growth n  Even though RTT, QoS, customer complaints, extreme cost of international bandwidth, logic and commonsense said otherwise n  But we did connect to JANET (UK academics) – because they were non-commercial and “nice guys”
  • 12. Birth of LINX n  Thankfully logic, commonsense, RTT, QoS and finances prevailed over the sales fear campaign n  The technical leadership of PIPEX, UKnet, Demon, BTnet and JANET met and agreed an IXP was needed n  Sweden had already got Europe’s first IX, the SE-GIX, and that worked v nicely n  Of course, each ISP wanted to host the IX as they had “the best facilities” n  Luckily agreement was made for an independent neutral location – Telehouse n  Telehouse was a Financial disaster-recovery centre – they took some serious persuading that this Internet thing was worth selling some rack space to
  • 13. Success: UK peering n  LINX was established n  Telehouse London n  5 UK network operators (4 commercial, 1 academic) n  BTnet was a bit later to the party than the others n  First “fabric” was a redundant PIPEX 5-port ethernet hub! n  We had just deployed our first Catalyst 1201 in our PoPs n  Soon replaced with a Catalyst 1201 8-port 10Mbps ethernet switch when the aggregate traffic got over about 3Mbps n  Joined by a second one when redundancy and more capacity was needed
  • 14. Third Lesson n  Peering is vital to the success of the Internet n  PIPEX sales took off n  Customer complaints about RTT and QoS disappeared n  Our traffic across LINX was comparable to our US traffic n  The LINX was critical in creating the UK Internet economy n  Microsoft European Datacentre was UK based (launched in 1995), connecting via PIPEX and BTnet to LINX n  Our resellers became ISPs (peering at LINX, buying their own international transit) n  More connections: smaller ISPs, international operators, content providers (eg BBC)
  • 15. IGPs n  IGRP was Cisco’s classful interior gateway protocol n  Migration to EIGRP (the classless version) happened many months after the Internet moved to BGPv4 n  Backbone point to point links were all /26s, and only visible inside the backbone, so the classfulness didn’t matter n  EIGRP was Cisco proprietary, and with the increasing availability of other router platforms for access and aggregation services, decision taken to migrate to OSPF n  Migration in itself was easy: EIGRP distance was 90, OSPF distance was 110, so deployment of OSPF could be done “at leisure”
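The "at leisure" point rests on administrative distance: when two protocols offer a route to the same prefix, the router installs the one with the lower distance. A minimal sketch of that selection rule follows, using the 90 and 110 values quoted above; the function name and the next-hop addresses are hypothetical.

```python
# Distances as quoted in the talk: EIGRP 90, OSPF 110
ADMIN_DISTANCE = {"eigrp": 90, "ospf": 110}

def select_route(candidates):
    """Pick the installed route for one prefix.

    candidates is a list of (protocol, next_hop) tuples; lowest distance wins.
    """
    return min(candidates, key=lambda c: ADMIN_DISTANCE[c[0]])

if __name__ == "__main__":
    # Both protocols running side by side: EIGRP still carries the traffic
    print(select_route([("ospf", "10.0.0.2"), ("eigrp", "10.0.0.1")]))
    # EIGRP removed once OSPF is verified: OSPF takes over
    print(select_route([("ospf", "10.0.0.2")]))
```

OSPF can therefore be rolled out and checked across the whole backbone while EIGRP keeps forwarding; the cut-over only happens when EIGRP is removed.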
  • 16. Fourth Lesson n  IGP migration needs to be done for a reason n  With a documented migration and back out plan n  With caution n  The reasons need to be valid n  EIGRP to OSPF in the mid 90s took us from working scalable IGP to IOS bug central :-( – Cisco’s OSPF rewrite was still half a decade away n  UUNET was by then our parent, with a strong ISIS heritage and recommendation n  Cisco made sure ISIS worked, as UUNET and Sprint needed it to do so
  • 17. Network Redundancy n  A single link of course means a single point of failure – no redundancy n  PIPEX had two links from UK to US n  Cambridge to Washington n  London to New York n  On separate undersea cables n  Or so BT and C&W told us n  And therein is a long story about guarantees, maintenance, undersea volcanoes, cable breaks, and so on
  • 18. Fifth Lesson n  Make sure that critical international fibre paths: n  Are fully redundant n  Do not cross or touch anywhere end-to-end n  Go on the major cable systems the supplier claims they go on n  Are restored after maintenance n  Have suitable geographical diversity (running in the same duct is not diversity)
  • 19. Aggregate origination n  Aggregate needs to be generated within ISP backbone for reachability n  Leak subprefixes only for traffic engineering n  “Within backbone” does not mean overseas PoP or at the peering edge of the network n  Remember those transatlantic cables n  Which were redundant, going to different cities, different PoPs, diverse paths,… n  Having the Washington border routers originate our aggregates wasn’t clever
  • 20. Aggregate origination n  Both transatlantic cables failed n  Because one had been rerouted during maintenance – and not put back n  So both our US circuits were on the same fibre – which broke n  We didn’t know this – we thought the Atlantic ocean had had a major event! n  Our backup worked – for outbound traffic n  But nothing came back – the best path as far as the US Internet was concerned was via MAE-East and our UUNET peering to our US border routers n  Only quick solution – switch the routers off, as remote access wasn’t possible either
  • 21. Sixth lesson n  Only originate aggregates in the core of the network n  We did that, on most of the backbone core routers, to be super safe n  But never on the border routers!!
  • 22. How reliable is redundant? n  Telehouse London was mentioned earlier n  Following their very great reluctance to accept our PoP, and the LINX, other ISPs started setting up PoPs in their facility too n  After 2-3 years, Telehouse housed most of the UK’s ISP industry n  The building was impressive: n  Fibre access at opposite corners n  Blast proof windows and a moat n  Several levels of access security n  3 weeks of independent diesel power, as well as external power from two different power station grids
  • 23. How reliable is redundant? n  Technically perfect, but humans had to run it n  One day: Maintenance of the diesel generators n  Switch them out of the protect circuit (don’t want a power cut to cause them to start when they were being serviced) n  Maintenance completed – they are switched back into the protect circuit n  Only the operator switched off the external mains instead n  Didn’t realise the mistake until the UPSes had run out of power n  Switched external power back on – the resulting power surge overloaded UPSes and power supplies of many network devices n  News headlines: UK Internet “switched off” by maintenance error at Telehouse
  • 24. How reliable is redundant? n  It didn’t affect us too badly: n  Once BT and Mercury/C&W infrastructure returned we got our customer and external links back n  We were fortunate that our bigger routers had dual supplies, one connected to UPS, the other to unprotected mains n  So even though the in-room UPS had failed, when the external mains power came back, our routers came back – and survived the power surge n  Other ISPs were not so lucky n  And we had to restrain our sales folks from being too smug n  But our MD did interview on television to point out the merits of solid and redundant network design
  • 25. Seventh lesson n  Never believe that a totally redundant infrastructure is that n  Assume that each component in a network will fail, no matter how perfect or reliable it is claimed to be n  Two of everything!
  • 26. Bandwidth hijack n  While we are talking about Telehouse n  And LINX… n  Early LINX membership rules were very restrictive n  Had to pay £10k membership fee n  Had to have own (proven) capacity to the US n  Was designed to keep smaller ISPs and resellers out of the LINX – ahem! n  Rules eventually removed once the regulator started asking questions – just as well! n  But ISPs still joined, many of them our former resellers, as well as some startups
  • 27. Bandwidth hijack n  We got a bit suspicious when one new ISP claimed they had T3 capacity to the US a few days after we had launched our brand new T3 n  Cisco’s Netflow quickly became our friend n  Had just been deployed on our border routers at LINX and in the US n  Playing with early beta software again on critical infrastructure :-) n  Stats showed outbound traffic from a customer of ours also present at LINX (we didn’t peer with customers) was transiting our network via LINX to the US n  Stats showed that traffic from an AS we didn’t peer with at MAE-East was transiting our network to this customer n  What was going on??
  • 28. Bandwidth hijack n  What happened? n  LINX border routers were carrying the full BGP table n  The small ISP had pointed default route to our LINX router n  They had another router in the US, at MAE-East, in their US AS – and noticed that our MAE-East peering router also had transit from UUNET n  So pointed a default route to us across MAE-East n  The simple fix? n  Remove the full BGP table and default routes from our LINX peering routers n  Not announcing prefixes learned from peers to our border routers
  • 29. Eighth lesson n  Peering routers are for peering n  And should only carry the routes you wish peers to see and be able to use n  Border routers are for transit n  And should only carry routes you wish your transit providers to be able to use
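The hijack worked because the peering router would forward anything it had a route for. A minimal longest-prefix-match sketch shows the difference between a peering router carrying a default (or the full table) and one carrying only the routes peers are meant to use. All prefixes, table contents and names here are hypothetical illustrations, not the actual PIPEX configuration.

```python
import ipaddress

def lookup(table, destination):
    """Longest-prefix match; returns the next hop, or None if no route (packet dropped)."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical peering router that also holds a default route towards transit
leaky_table = {
    ipaddress.ip_network("0.0.0.0/0"): "us-transit",
    ipaddress.ip_network("203.0.113.0/24"): "customer-a",
}

# Same router carrying only the routes peers should be able to reach
peering_only_table = {
    ipaddress.ip_network("203.0.113.0/24"): "customer-a",
}

if __name__ == "__main__":
    # A peer pointing default at us and sending traffic for a US destination
    print(lookup(leaky_table, "198.51.100.7"))         # forwarded to transit
    print(lookup(peering_only_table, "198.51.100.7"))  # None: no free transit
    print(lookup(peering_only_table, "203.0.113.9"))   # customer still reachable
```

With no matching route on the peering router, a neighbour's default route towards it simply black-holes their own traffic, which removes the incentive.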
  • 30. The short sharp shock n  It may have only been 5 years from 1993 to 1997 n  But Internet adoption grew at a phenomenal rate in those few years n  In the early 90s it was best effort, and end users were still very attached to private leased lines, X.25, etc n  By the late 90s the Internet had become big business n  Exponential growth in learning and experiences n  There were more than 8 lessons! n  (Of course, this was limited to North America and Western Europe)
  • 31. Moving onwards n  With UUNET’s global business assuming control of and providing technical direction to all regional and country subsidiaries, it was time to move on n  In 1998, next stop Cisco: n  The opportunity to “provide clue” internally on how ISPs design, build and operate their networks n  Provide guidance on the key ingredients they need for their infrastructure, and IOS software features n  All done within the company’s Consulting Engineering function n  The role very quickly became one of infrastructure development
  • 32. Internet development n  Even though it was only over 5 years, I had accumulated an in-depth skillset in most aspects of ISP design, set up, and operational best practices n  The 90s were the formative years of the Internet and the technologies underlying it n  Best practices gained from experiences then form the basis for what we have today n  Account teams and Cisco country operations very quickly involved me in educating Cisco ISP customers, new and current n  Working with a colleague, the Cisco ISP/IXP Workshops were born
  • 33. Internet development n  Workshops: n  Teaching IGP and BGP design and best practices, as well as new features n  Covered ISP network design n  Introduced the IXP concept, and encouraged the formation of IXes n  Introduced latest infrastructure security BCPs n  Early introduction to IPv6 n  Out of the workshops grew requests for infrastructure development support from all around the world
  • 34. Development opportunities n  Bringing the Internet to Bhutan n  Joining AfNOG instructor team to teach BGP and scalable network design n  Introducing IXPs to several countries around Asia n  Improving the design, operation and scalability of service provider networks all over Asia, Africa, Middle East and the Pacific n  Helping establish network operations groups (NOGs) – SANOG, PacNOG, MENOG etc n  Growing APRICOT as the Asia Pacific region’s premier Internet Operations Summit
  • 35. NOG Development n  Started getting more involved in helping with gatherings of local and regional operations community n  APRICOT was the first experience – difficulties of APRICOT ’98 and ’99 led to a refresh of the leadership in time for APRICOT 2001 n  APRICOT growing from strength to strength – but annual conference had 56 economies across AsiaPac to visit! n  Regional and Local NOGs were the only way to scale
  • 36. NOG Development n  NZNOG and JANOG were starting n  SANOG launched in January 2003, hosted alongside Nepalese IT event n  Several international “NOG experts” participated n  Purpose (from www.sanog.org): “SANOG was started to bring together operators for educational as well as co-operation. SANOG provides a regional forum to discuss operational issues and technologies of interest to data operators in the South Asian Region.” n  And this is a common theme for most NOGs founded since
  • 37. Ingredients for a successful NOG ①  Reach out to community and organise a meeting of interested participants ②  Reach out to colleagues in community and further afield and ask them to come talk about interesting operational things ③  Figure out seed funding and find a place to meet ④  Commit to a 2nd NOG meeting ⑤  Have fun!
  • 38. Ingredients for a successful NOG n  Avoid: n  Setting up lots of committees before the NOG meets for the first time n  Worrying about what fees to charge or discounts to provide n  Worrying about making a profit n  Hiring expensive venues, event organisers n  Providing expensive giveaways n  Providing speaking opportunities to product marketeers
  • 39. Ingredients for a successful NOG n  During that first meeting: n  Solicit suggestions about the next meeting n  Location, content, activities n  Suggest a mailing list n  And then set it up, encouraging participants to join n  Encourage organisations participating to consider future sponsorship n  Encourage colleagues to help with various tasks n  Organise a meeting of the folks who helped pull the first meeting together n  Here is the first committee, the Coordination Team
  • 40. Ingredients for a successful NOG n  After the first meeting: n  Plan that 2nd meeting, relaxation is not allowed n  Don’t expect lots of people to rush and help n  NOG leadership is about being decisive and assertive n  And can often be lonely n  Organise the next meeting of the Coordination Team (face to face, teleconference,…) n  Don’t lose momentum n  Keep the Coordination Team involved
  • 41. Ingredients for a successful NOG n  Going forwards: n  Encourage discussion and Q&A on the mailing list n  No question is too silly n  Run the second meeting, plan the third n  But don’t try and do too many per year – one or two are usually enough n  Don’t rely on the international community for everything – encourage and prioritise local participation n  Start thinking about breaking even n  After the 2nd or 3rd meeting, assistance with programme development – the next committee!
  • 42. The final lesson? n  Setting up a NOG takes effort and persistence n  Bring the community along with you n  People attend, and return, if the experience is positive and the content is worth coming for n  Include all sectors and regions the NOG claims to cover n  Budget needs to be neutral, sponsorship generous, participant costs low n  No bureaucracy!
  • 43. The story goes on… n  IXP experiences n  Nepal, Bangladesh, Singapore, Vanuatu, India, Pakistan, Uganda, PNG, Fiji, Samoa, Thailand, Mongolia, Philippines,…
  • 44. The story goes on… n  Other ISP design and redesigns
  • 45. The story goes on… n  Satellites n  falling out of sky n  latency/tcp window vs performance
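The latency and TCP window point can be made concrete with a back-of-envelope calculation: a single TCP connection can have at most one window of data in flight per round trip, so its throughput is capped at window / RTT regardless of link capacity. The figures below are assumptions for illustration (a 65,535-byte window, i.e. no window scaling, and a geostationary round trip of about 600 ms); the talk gives no numbers.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on single-connection TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds

def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: window required to keep the link full."""
    return link_bps * rtt_seconds / 8

if __name__ == "__main__":
    # Assumed values: 65,535-byte window, ~600 ms satellite RTT
    print(f"{max_throughput_bps(65535, 0.6) / 1e6:.2f} Mbit/s")
    # Window needed to fill an assumed 10 Mbit/s satellite link at the same RTT
    print(f"{window_needed_bytes(10e6, 0.6):.0f} bytes")
```

Under these assumptions one connection tops out below 1 Mbit/s, and filling a 10 Mbit/s link would need a window of about 750,000 bytes, which is why satellite links so often look slower than their rated capacity.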
  • 46. The story goes on… n  Fibre optics being stolen n  Folks thinking it is copper
  • 47. The story goes on… n  The North Sea fogs and snow which block microwave transmission
  • 48. The story goes on… n  “You don’t understand, Philip” n  From ISPs, regulators, business leaders, who think their environment is unique in the world
  • 49. The story goes on… n  “Ye cannae change the laws o’ physics!” (Montgomery “Scotty” Scott, Star Trek) n  To operators and end users who complain about RTTs