Beginning with EdgeHTML 17, Microsoft Edge is the first browser to support Screen Capture via the Screen Capture API. Web developers can start building on this feature today by upgrading to the Windows 10 April 2018 Update, or by using one of our free virtual machines.
Screen Capture uses the new getDisplayMedia API specified by the W3C Web Real-Time Communications Working Group. The feature lets web pages capture the output of a user’s display device, which is commonly used to broadcast a desktop for plugin-free virtual meetings or presentations. Using Media Capture, Microsoft Edge can capture all Windows applications, including Win32 and Universal Windows Platform applications (UWP apps).
In this post, we’ll walk through how Screen Capture is implemented in Microsoft Edge, and what’s on our roadmap for future releases, as well as some best practices for developers looking to get started with this API today.
Getting started with the Screen Capture API
The getDisplayMedia() method is the heart of the Screen Capture API. The getDisplayMedia() call takes MediaStreamConstraints as an optional input argument. Once the user grants permission, the getDisplayMedia() call returns a promise that resolves with a MediaStream object representing the user-selected capture device.
The MediaStream object will only have a MediaStreamTrack for the captured video stream; there is no MediaStreamTrack corresponding to a captured audio stream. The MediaStream object can be rendered on multiple rendering targets, for example, by setting it on the srcObject attribute of a media element (e.g. a video tag).
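As a minimal sketch of rendering one captured stream on several targets, the helper below assigns the same MediaStream to the srcObject of each element it is given. The function name renderOnAll is hypothetical, and the usage line assumes a page with more than one video tag set to autoplay.

```javascript
// Hypothetical helper: attach one captured MediaStream to several
// rendering targets by setting srcObject on each of them.
// `elements` is any array-like collection of media elements.
function renderOnAll(stream, elements) {
  var attached = 0;
  for (var i = 0; i < elements.length; i++) {
    elements[i].srcObject = stream;
    attached++;
  }
  // Return the number of targets that now render the stream.
  return attached;
}

// Usage in a page (assumes the markup contains several <video autoplay> tags):
// renderOnAll(stream, document.querySelectorAll('video'));
```

Because the helper only touches srcObject, it works with any object that exposes that property, which keeps it easy to exercise outside the browser.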
While the operation of the getDisplayMedia API is superficially very similar to getUserMedia, there are some important differences. To ensure users are in control of any sensitive information which may be captured, getDisplayMedia does not allow the MediaStreamConstraints argument to influence the selection of sources. This is different from getUserMedia, which enables picking a specific capture device.
Our implementation of Screen Capture currently does not support the use of MediaStreamConstraints to influence MediaStreamTrack characteristics (such as framerate or resolution). The getSettings() method can’t be used to obtain the type of display surface that was captured, although information such as the width, height, aspect ratio and framerate of the capture can be obtained. Within the W3C Web Real-Time Communications Working Group there is ongoing discussion of how MediaStreamConstraints should influence properties of the captured screen device, such as resolution and framerate, but consensus has not yet been reached.
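To illustrate what getSettings() can report, here is a small sketch that reads the width, height, aspect ratio and framerate from the video track of a captured stream. The function name describeCapture is hypothetical; the property names (width, height, aspectRatio, frameRate) are the standard MediaTrackSettings members, and any of them may be undefined if the browser does not report it.

```javascript
// Hypothetical helper: summarize the settings of the captured video track.
// Returns null if the stream has no video track.
function describeCapture(stream) {
  var tracks = stream.getVideoTracks();
  if (tracks.length === 0) {
    return null;
  }
  var settings = tracks[0].getSettings();
  return {
    width: settings.width,
    height: settings.height,
    aspectRatio: settings.aspectRatio,
    frameRate: settings.frameRate
  };
}

// Usage inside a success handler:
// console.log(describeCapture(stream));
```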
User permissions
While screen capture functionality can enable a lot of exciting user and business scenarios, removing the need for additional third-party software, plugins, or manual user steps for scenarios such as conference calls and desktop screenshots, it also introduces security and privacy concerns. Explicit, opt-in user consent is a critical part of the feature.
While the W3C specification recommends some best practices, it also leaves each browser some flexibility in implementation. To balance security and privacy concerns and user experiences, our implementation requires the following:
- An HTTPS origin is required for getDisplayMedia() to be called.
- The user is prompted to allow or deny permission to allow screen capture when getDisplayMedia() is called.
- While the user’s chosen permissions persist, the capture picker UI will come up for each getDisplayMedia() call. Permissions can be managed via the site permissions UI in Microsoft Edge (in Settings or via the site info panel in the URL bar).
- If a webpage calls getDisplayMedia() from an iframe, we will manage the screen capture device permission separately based on its own URL. This provides protection to the user in cases where the iframe is from a different domain than its parent webpage.
- As noted above, we do not permit MediaStreamConstraints to influence the selection of getDisplayMedia screen capture sources.
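Given these requirements, a page may want to check up front whether a capture request can succeed at all. The sketch below is one possible pre-flight check, not part of the API: canRequestScreenCapture and requestScreenCapture are hypothetical names, and the navigator and location objects are passed in as arguments so the logic can be exercised in isolation.

```javascript
// Hypothetical pre-flight check: the API must exist and the page must be
// served from an HTTPS origin.
function canRequestScreenCapture(nav, loc) {
  var hasApi = !!nav && typeof nav.getDisplayMedia === 'function';
  var isHttps = !!loc && loc.protocol === 'https:';
  return hasApi && isHttps;
}

// Hypothetical wrapper: reject early when the check fails, otherwise ask
// the user. The picker and permission prompt are shown by the browser.
function requestScreenCapture(nav, loc) {
  if (!canRequestScreenCapture(nav, loc)) {
    return Promise.reject(new Error('Screen capture is unavailable on this page'));
  }
  return nav.getDisplayMedia({ video: true });
}

// Usage in a page:
// requestScreenCapture(navigator, location)
//   .then(function (stream) { /* use the stream */ })
//   .catch(function (error) { console.log(error); });
```

The catch handler also covers the case where the user denies permission, since the promise returned by getDisplayMedia() is rejected in that case.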
Sample scenarios using screen capture
Screen capture is an essential step in many scenarios, including real-time audio and video communications. Below we walk through a simple scenario introducing you to how to use the Screen Capture functionality.
Capture photo from a screen capture device
Let’s assume we have a video tag on the page and it is set to autoplay. Prior to calling navigator.getDisplayMedia, we set up constraints and create a handleSuccess function to wire the screen capture stream to the video tag as well as a handleError function to log an error to the console if one occurs.
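    // Look up the video tag assumed to exist on the page (set to autoplay).
    var video = document.querySelector('video');

    var constraints = {
      video: true
    };

    // Wire the screen capture stream to the video tag.
    function handleSuccess(stream) {
      video.srcObject = stream;
    }

    // Log the error to the console if one occurs.
    function handleError(error) {
      console.log('navigator.getDisplayMedia error: ', error);
    }

    navigator.getDisplayMedia(constraints)
      .then(handleSuccess)
      .catch(handleError);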
When navigator.getDisplayMedia is called, the picker UI comes up and the user can select whether to share a window or a display.
While being captured, the chosen application or display will have a yellow border drawn around it, which is not included in the capture frame. Application windows being captured will return black frames while minimized (though they will still be enumerated in the picker); if the window is restored, rendering will resume.
If an application window includes a privacy flag (setDisplayAffinity or isScreenCaptureEnabled), the application is not enumerated in the picker. Application windows being captured will not include overlapping content, which is an improvement on snapshotting the entire display and cropping to window location.
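Once the stream is playing in the video tag, a still photo can be taken by drawing the current video frame onto a canvas. The sketch below shows one way to do this; captureFrame is a hypothetical helper name, and it assumes the video element has already started rendering frames (so videoWidth and videoHeight are non-zero).

```javascript
// Hypothetical helper: copy the current frame of a playing <video> element
// onto a canvas and return it as a PNG data URL.
function captureFrame(video, canvas) {
  // Size the canvas to the intrinsic size of the captured video.
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  var context = canvas.getContext('2d');
  context.drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas.toDataURL('image/png');
}

// Usage in a page:
// var photo = captureFrame(document.querySelector('video'),
//                          document.createElement('canvas'));
// someImageElement.src = photo; // someImageElement is a placeholder
```

A minimized window that is returning black frames will produce a black photo, so a page may want to take the snapshot only in response to a user action while the shared content is visible.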
What’s next for Screen Capture
Currently the MediaStream produced by getDisplayMedia can be consumed by the ORTC API in Microsoft Edge. To optimize encoding in screen capture scenarios, the degradationPreference encoding parameter is used. For applications where video motion is limited (e.g. a slideshow presentation), degradationPreference should be set to “maintain-resolution” for best results. To limit the maximum framerate that can be sent over the wire, the maxFramerate encoding parameter can be used.
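As a rough sketch of how these two parameters might be applied, the helper below copies an existing ORTC send-parameters object and adjusts it for screen sharing. The name withScreenShareTuning is hypothetical, and the layout of the parameters object (degradationPreference at the top level, maxFramerate on each entry of encodings) is an assumption based on the ORTC specification rather than something this post spells out.

```javascript
// Hypothetical helper: return a copy of ORTC-style send parameters tuned
// for screen sharing. The object shape is an assumption (see lead-in).
function withScreenShareTuning(params, options) {
  var tuned = Object.assign({}, params);
  // Prefer sharp frames over smooth motion unless told otherwise.
  tuned.degradationPreference =
    options.degradationPreference || 'maintain-resolution';
  tuned.encodings = (params.encodings || []).map(function (encoding) {
    var copy = Object.assign({}, encoding);
    if (typeof options.maxFramerate === 'number') {
      // Cap the framerate sent over the wire.
      copy.maxFramerate = options.maxFramerate;
    }
    return copy;
  });
  return tuned;
}

// Usage with an ORTC sender (params built elsewhere by the application):
// sender.send(withScreenShareTuning(params, { maxFramerate: 5 }));
```

The helper leaves the original parameters untouched, so the same base object can be reused for a camera track that needs different settings.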
To use the MediaStream with the WebRTC 1.0 API in Microsoft Edge, we recommend the adapter.js library, as we work towards support for getDisplayMedia along with the WebRTC 1.0 object model in a future release.
You can get started with the Screen Capture API in Microsoft Edge today on EdgeHTML 17.17134 or higher, available in the Windows 10 April 2018 Update or through the free virtual machines on the Microsoft Edge Developer Site. Try it out and let us know what you think by reaching out to @MSEdgeDev on Twitter or submitting feedback at https://issues.microsoftedge.com!
– Angelina Gambo, Senior Program Manager, Microsoft Edge
– Bernard Aboba, Principal Architect, Skype
Source: Windows Blog