PDF is not evil. Ignorance is.

By Iacobien Riezebosch of Iacobien.nl

PDFs are evil. This is a sentiment you're hearing more and more. You shouldn't be publishing PDFs on government websites. However, not publishing PDFs by definition means you're no longer thinking about a solid content strategy and when a specific file format is a wise choice.

Slides

PDF is not evil. Ignorance is. (857 kB)

Video

For this video a transcript and captions are available.

About the session

We often hear that PDF is bad and that we shouldn't use it. But the problem is not the format. First, it's the lack of attention from software engineers. And second, we need a content strategy for PDF. In a lot of cases the content of a PDF is not good. It lacks a good information structure and the language is miserable.

PDF has enough possibilities to deliver accessible and usable content. We just need more effort to improve the software and the content.

About Iacobien Riezebosch

Iacobien is an authority on PDF and digital accessibility. Her focus is on organization-wide approaches to digital accessibility and on complex issues, such as inaccessible PDFs and setting up appropriate management organizations. Iacobien has also written several articles about accessibility.

You can find more on her website, Iacobien.nl.

Transcript

[LARISSA] Without further ado, I really would like to introduce you to our next and also last keynote speaker of the morning. Welcome, Iacobien. I think a lot of you already know Iacobien. Iacobien Riezebosch.

You are a freelance consultant, author and speaker who has worked in digital accessibility for 23 years already, which is impressive.

That's quite a long time. I was 8 then. I feel quite a difference. You have also been a member of the PDF Association working group for new techniques for accessible PDFs since 2018, which is already 7 years.

[IACOBIEN] Yes. And we published the first techniques and the basics and the glossary, the first fundamentals and the techniques for headings. As you might know, the techniques from the W3C are outdated and these will be replacing them.

[LARISSA] You do really wonderful work.

[IACOBIEN] With a lot of wonderful people.

[LARISSA] I can imagine that with already seven years in that working group that you have gathered so much experience, which makes you, I think, the perfect person to share all the knowledge today. Enjoy and go ahead.

[IACOBIEN] Thank you.

[APPLAUSE]

PDF is not evil, ignorance is. I have an image on the screen with a desk with a sign on it that says "Policy no more PDFs", as you might be familiar with such policies. In the background there are two groups of PDFs with arms and legs approaching the desk.

Nowadays people are saying PDFs should be banned from government websites and that PDFs are usually not user-friendly, often inaccessible, and should not be published online. This decision is often made in desperation and not based on the correct information and the actual problems.

Today I want to share with you why PDF in itself is not evil and address the real issues. There are experts in the room, many of you advise decision makers and I want to help you and support you with advising them correctly with up-to-date correct information.

This was my longest sheet. I'm happy to help you and also to dispel some myths. This is funny because we tested this and now I just... It doesn't work. It's peculiar. It works. I fixed it. Yeah, I think I did.

Well, I did on my screen, but not on the big screen, so. Oh yeah, there it goes. Yeah. What is PDF? PDF is the portable document format. And think of PDF like a box with everything in it which makes it portable.

And there's everything in this box to make it... You can present it anywhere, anytime, in the right way, so it travels with everything it needs. And it could work well on every device.

I can compare government websites with a pool and on the screen, there is a pool with not people jumping in, but PDFs jumping in. As you might be able to tell, there's no lifeguard on duty.

This is like a pool full of teenagers who like to have fun and jump in and nobody is stopping them. The website is crowded with PDFs, and they have not been checked and there's no effective control.

Often there's a gatekeeper approach, which means there's a gatekeeper. In the image you can see a web editor with the text "web editor" on the back of her uniform, and there's a gate. She's trying to stop the PDFs that are storming the gates and are mostly unhappy and angry because they're bad PDFs, and they want to go online.

This gatekeeper approach usually doesn't work. There are one or two gatekeepers, and they have to stop all the PDFs in the entire organization. People do not like the gatekeeper because she's from "Team No", she's the PDF police and people try to work around her and they do.

Management is interfering by pressuring the gatekeeper to publish this wonderful PDF online, which is a nightmare for accessibility and usability. There are just way too many PDFs for the gatekeeper to control. It's an end of line approach, and it doesn't work. What is the reaction?

There are many, many PDFs online that were not meant for online use, that have bad content, that are not good for online. A lot of reactions are like this article, "PDFs are Evil, Lazy, Sinful, Slothful" by Gerry McGovern, and other articles, like "Bye bye PDF, hello HTML!"

And there are tons of articles like this and actually these are reactions against bad content and not against PDF in itself. But usually all the PDFs are print PDFs or bad content. It's more against the content than the PDF itself.

It's a symptom of an organisation not functioning in this field. On screen, you can now see a woman who is trying to ban a PDF with a bulb of garlic. I think banning PDFs is like throwing the baby out with the bath water.

That means that you also get rid of usable stuff or good stuff. I want to warn against overreacting. That's oversimplifying the problem: "The problem is PDF. Ban PDFs, problem solved." It means choosing an unfeasible and unwanted approach, discarding all positives because of misuse and no control, and losing the valuable elements to fix a problem.

What we are doing is barking up the wrong tree. And barking up the wrong tree has a meaning. In the image you can see two trees. In the left tree there's a pheasant. The right tree is empty, and a dog is barking up the empty tree.

This expression means that you're directing your attention, efforts or blame at the wrong thing. That's what I think we're doing, we're blaming PDF. And that's not the issue. There are some myths. The first myth is that PDF is for print.

Yes, PDF comes from 1993, when it was developed to print documents. But then in 2001, so that's 24 years ago, PDF 1.4 came out, which introduced tagged PDF. Tagged PDF means a PDF can be made accessible.

It's not for print alone or not at all. It can be used for print, but it's not a file format just for print.

Then PDF 1.7 was published in 2008. That was a full ISO standard, and now we have PDF 2.0, published in 2017, and that's the first post-Adobe ISO standard. We're still saying PDF is for print. Well, things have changed and evolved.

Then there is the Matterhorn Protocol, which some of you might know and some might not. That's the protocol to test PDFs. It has over 130 criteria to test on, and it defines which ones you can test as a human and which ones you can test with a machine.

Its first publication was already in 2013. The file format itself has evolved. There are other problems, like PDF readers not doing what they're supposed to do. What is the actual problem? Well, the actual problem is that it's a multi-headed problem.

I want to introduce you to some elements of this multi-headed monster. The first one is no control over the ecosystem. PDF has a very complex ecosystem around it. In a CMS system you have more control over web pages and people are restricted.

PDF for authors is like the Wild West. They can do whatever they want and usually nobody is stopping them. They can choose their own colour combinations even if they're not supposed to. They can use whatever software they like.

There are many different types of software, many different authors, there are different test tools. It's very complex to get control of the ecosystem and that needs a lot of effort. And usually organisations don't have this control yet.

Another thing is lack of knowledge, and that's lack of knowledge about the file format, about the standards, about the software, about the users, about the possibilities, about the restrictions of the tools we use. And that's at a management level.

That's why they often interfere and force you to publish inaccessible PDFs. But it's also at a tactical level, where I see that a lot of people are really doing their best, but are still using the wrong tools or don't have the right information yet.

There's no clear distinction between the decisions that should be made by management and those made on the tactical level. For example, on a tactical level, a strategy is made, and on management level, somebody says this PDF should go online without checking with the editors if it's a good PDF.

Often you see also that IT is not enabling employees to do their work. They cannot instal the right tools, not the right testing tools. The organisation buys tools that cannot create an accessible PDF. There's really not a good test process in many organisations.

On this image you can see a police line-up, and instead of people in the line-up, there are five PDFs in the line-up with arms and legs, and they have rich content, but are very different in appearance. Without a good testing process, you cannot tell the good guys from the bad guys.

Sometimes there's only an automatic test, sometimes there's only a test on WCAG which is insufficient for PDF. Sometimes it's just the wrong tool or the wrong test.

If you cannot tell which PDFs are good or bad or what the actual problems are, if you don't identify all the problems, you can also not fix them. Then people remediate, but the file is still not accessible.

We also have a very strong focus on remediation, which means fixing a PDF. That's a blind spot as well, that we think that's the solution. We have solutions that don't meet legal requirements.

For example, an inaccessible PDF with an alternative in HTML that just contains the summary: that's not accessible. It doesn't comply with WCAG, where there are requirements about an alternative.

In the Netherlands, you have to fill out an accessibility statement, and then you need to give full alternatives for inaccessible content and also mention what is not accessible in this PDF.

Just putting up a summary in HTML excludes people. I want to be very clear, this is unacceptable for organisations. I know people on an individual level really do their best within the time they have, but as an organisation, this is inaccessible, and it's the law to do better.

Default tools don't deliver. There are problems with, for example, Microsoft Office export to PDF. It also depends on the platform you use: on Windows, the latest version is relatively the best version, but on a Mac it's less good.

On the online version I cannot even fill in a document title, which is the basics of accessible documents. That's really bad.

Microsoft always gives us breadcrumbs that things are getting better, but a little bit better is just not good enough if you're not even able to fill out a title, or if your Mac version needs an online export. Adobe has some problems too.

A lot of people think they can use the quick check in Adobe to determine whether a PDF is accessible. That's just not enough. It's such a small test, it doesn't say anything about the entire PDF.

Also, there are some tools in Adobe Acrobat Pro that are not following standards, like Reflow, for example. It doesn't give you an accurate view of what Reflow should look like. People draw conclusions from these tools and end up doing the wrong thing to make the document accessible.

There is no organizational awareness that these tools don't deliver. People keep on buying these tools without buying additional tools or different tools.

I hear many government organisations who really are trying to do their best and then say, "We bought Adobe Acrobat Pro for all our employees to create accessible PDFs." Then there's a problem with PDF readers and browsers.

This is also one of the main problems with PDF: the possibilities are not used. When you make an entirely accessible PDF form that conforms with WCAG and PDF/UA, which I will tell you a little bit more about later, you still cannot use the features on a smartphone, for example.

There are so many platforms, readers, browsers, and they don't give us the full advantage of an accessible PDF. That's a problem for every user. We really need a good PDF reader.

This is an example of a PDF reader that has been available where you can see my presentation in one column and just in blue. Sorry, in yellow letters on a black background with just one column. It's also responsive because PDF can do Reflow, but the tools don't allow it.

So Big Tech fails us. Their intent and their incentive is money. We have no common, no shared values, because our value is inclusiveness, accessibility, being there for everybody. And theirs is just money. We pay a lot of money for bad tools every month, and they don't deliver.

Sorry about that. Creating an accessible PDF should not be this hard. It should be way easier. It is too hard. Of course it's possible, but it's unnecessarily hard. We should innovate and diminish our dependence on Big Tech.

Also for other reasons, like our sovereign constitutional democracy, which they are messing with. But that's another story. Better PDF readers are needed and there are alternatives on the market and we really should look beyond Big Tech. Then in the Netherlands we have the poldermodel, which doesn't help.

That's an approach to decision-making based on consensus, dialogue and compromise. There's no regulatory supervision and enforcement or as we say in Dutch, handhaving en toezicht and we are not set up to comply with standards.

A little bit of this, a little bit of that, but a little bit of this is not fully accessible. And you still exclude people. We are not made to just comply with standards.

On this image you see a woman lying on the sofa with a therapist next to her, and the therapist says, "You don't hate PDFs, you're just tired of..." Well, what is she tired of?

Well, she's tired of bad content, whether HTML or PDF, not enough time, too much content, too many PDFs, the wrong file format being used, discussions, being part of "Team No", being overruled, micro-level decisions just about one document, and software that's not doing what it's supposed to do.

These people cannot do their job because a lot of you are probably her. It's too much, and it's not solved in the right places. PDF or HTML. I'm not for more PDFs online or bad PDFs online.

Do not use PDFs to present digital content that could and should otherwise be published as a web page. Often a web page is a good choice. Do not publish print PDFs; they are not web content. Do not publish interactive PDFs with buttons on every page for navigation.

If you want that, build a website. That's not an accessible PDF. It's not good content. Just stop it. Do not publish bad content. That's of course very important. Tagged PDF is important. The key for accessibility is the tree structure with tags. When is PDF useful?

I know there's a lot of information on this sheet, too much to read, but you can find the handout online, or you can take a picture.

Usually I don't cram my sheets like this. For download, references, offline use: so your payslip, your tax authority decisions, contracts and signing contracts, user manuals, data that needs to be stored locally, sometimes reports, of course not always, but it could be. Then PDF would be a good choice.

Need for annotation, so you need to jot down some comments. Need to archive documents in a personal administration or elsewhere. Or when you want something really frozen in time, then you use PDF/A, which is for archiving. And you can have security settings and it's everything in one box, so it can be useful.

Good use of standards. My last two slides. There are three standards that are relevant here. WCAG, which is the Web Content Accessibility Guidelines. In this image I represent WCAG as a bridge because it's about web content.

Can you see where one lane starts and the other one ends, so you can drive your truck in the right lane? Or can you read the sign on the bridge that gives you directions? Then you have PDF/UA, which is the technical standard for PDF and that's comparable with a bridge.

Isn't the slope too steep to even access the top of the bridge? If you cannot access the top of the bridge, you cannot access the content. PDF/UA is necessary for accessible PDFs. It's not a nice to have.

And PDF/A, which is the standard for archiving, is like a time capsule where you lock everything into the PDF, fonts included and no scripts, so it can be used later on in time.

Your next steps. Anti-ignorance: PDF is not evil, and I hope you can share this message and also dispel myths like "PDF is just for print" and "it's outdated". Check your content strategy. A content strategy that's formalised gives you guidance. Use the appropriate file format for the job.

Embrace PDF/UA and so use PDF where it suits your needs better than HTML. But HTML is usually the way to go. PDF/UA simplifies discussions and is necessary for technical accessibility. Be prepared. PDF/UA should be mandatory because WCAG is not enough. It should be mandatory by law.

There are people working on that. For example, in Germany it's also in the law. Diminish dependency on Big Tech, for many reasons. But also use alternatives, plugins and innovate.

I know you all don't like PDFs, you don't love them, but I just wanted to mention, for my own peace of mind, to let you know that no PDFs were harmed in the making of this presentation. I would like to answer your questions.

Later today there will be two parallel sessions at 13:30 and 15:05 in Zaal Amathist on the ground floor. If you want more non-alternative facts, you can also follow me on LinkedIn.

[LARISSA] Thanks, Iacobien.

[APPLAUSE]